🤖 distilbert-hindi-eou-detector
A fine-tuned DistilBERT model for End-of-Utterance (EOU) Detection in conversational Hindi. This model identifies whether a Hindi dialogue phrase marks the end of a speaker's turn, making it suitable for voice assistants, dialogue systems, or turn-taking logic in chatbots.
🧠 Model Description
- Base model: distilbert-base-multilingual-cased
- Language: Hindi
- Task: Binary Classification — End of Utterance Detection
- Labels:
  - 1: End of Utterance (EOU)
  - 0: Not End of Utterance (NOT_EOU)
🗂️ Training Dataset
This model was fine-tuned on the hindi-conversational-eou dataset — a balanced collection of 1000 Hindi conversational phrases labeled for end-of-turn detection.
Each example in the dataset is a short Hindi phrase labeled with:
- "text": The utterance string
- "label": 0 or 1 (as defined above)
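As a rough illustration of the record format described above, the sketch below checks that a record has a non-empty "text" string and a 0/1 "label", then counts labels to confirm balance. The helper names (`validate_example`, `label_counts`) and the sample records are hypothetical; the labels on the samples were assigned by hand for this example and are not taken from the dataset.

```python
from collections import Counter

VALID_LABELS = {0, 1}  # 1 = EOU, 0 = NOT_EOU, as defined in the model description


def validate_example(record):
    """Return True if a record has a non-empty "text" string and a 0/1 "label"."""
    text = record.get("text")
    label = record.get("label")
    if not isinstance(text, str) or not text.strip():
        return False
    # bool is a subclass of int in Python, so reject True/False explicitly.
    if isinstance(label, bool) or not isinstance(label, int):
        return False
    return label in VALID_LABELS


def label_counts(records):
    """Count how many valid records carry each label."""
    return Counter(r["label"] for r in records if validate_example(r))


# Hypothetical records for illustration only (labels assigned by hand).
samples = [
    {"text": "क्या तुम मेरे साथ चलोगे?", "label": 1},
    {"text": "अगर हम वहाँ जाते तो", "label": 0},
    {"text": "", "label": 1},  # rejected: empty text
]
counts = label_counts(samples)
print(counts[0], counts[1])
```

A balanced dataset in this sense would show roughly equal counts for the two labels.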
📊 Evaluation Metrics
(Note: These are example metrics and may not reflect the model's measured performance.)
- Accuracy: 92.4%
- F1 Score: 91.7%
- Precision: 90.5%
- Recall: 93.0%
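For readers who want to reproduce these four metrics on their own held-out data, here is a minimal pure-Python sketch that treats label 1 (EOU) as the positive class. The function names and the toy label lists are assumptions for illustration, not the author's evaluation code. It also shows that the F1 figure listed above is consistent with the listed precision and recall, since F1 is their harmonic mean.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_from_precision_recall(precision, recall),
    }


# Consistency check on the figures listed above (precision 90.5%, recall 93.0%).
print(round(f1_from_precision_recall(0.905, 0.930) * 100, 1))  # 91.7

# Toy labels for illustration only, not real model output.
toy = binary_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
print(toy)
```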
🛠️ Intended Use
This model is ideal for:
- Voice assistants (to detect pauses vs. final utterances)
- Dialogue systems and conversational AI
- Research in Hindi language conversation modeling
🧪 Example Usage
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub
classifier = pipeline("text-classification", model="yashsoni78/distilbert-hindi-eou-detector")

# Example phrases
examples = [
    "क्या तुम मेरे साथ चलोगे?",
    "अगर हम वहाँ जाते तो",
]

# Classify each phrase and print the predicted label with its confidence score
for text in examples:
    result = classifier(text)
    print(f"{text} => {result}")
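For a single input string, the text-classification pipeline returns a list containing one dictionary with "label" and "score" keys. The sketch below shows one possible way to turn that into a yes/no turn-taking decision. The label strings are an assumption: "LABEL_1" is the Transformers default name for class id 1 when no custom mapping is configured, and "EOU" is a guess at a custom name, so check the model's config before relying on either. The helper `is_end_of_utterance` and the 0.5 threshold are hypothetical.

```python
# Assumed label strings; verify against the model's id2label mapping.
EOU_LABEL_NAMES = {"LABEL_1", "EOU"}


def is_end_of_utterance(result, threshold=0.5, eou_labels=EOU_LABEL_NAMES):
    """Decide whether one pipeline result signals the end of a turn.

    `result` is the list returned for a single input string, for example
    [{"label": "LABEL_1", "score": 0.97}].
    """
    top = result[0]
    return top["label"] in eou_labels and top["score"] >= threshold


# Hand-written stand-ins for pipeline output (not real model predictions).
print(is_end_of_utterance([{"label": "LABEL_1", "score": 0.97}]))  # True
print(is_end_of_utterance([{"label": "LABEL_0", "score": 0.88}]))  # False
print(is_end_of_utterance([{"label": "LABEL_1", "score": 0.55}], threshold=0.8))  # False
```

Raising the threshold makes a voice assistant wait for stronger evidence before responding, which trades some latency for fewer interruptions of the speaker.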
🔍 Limitations
- Trained on a small dataset (1000 examples); may not generalize to complex or domain-specific Hindi.
- Only binary EOU detection, no deeper semantic understanding.
- Assumes input is in colloquial conversational Hindi.
🧾 Citation
If you use this model in your research or application, please cite:
@misc{distilbert_hindi_eou_2025,
title = {distilbert-hindi-eou-detector},
author = {Yash Soni},
year = {2025},
howpublished = {\url{https://huggingface.co/yashsoni78/distilbert-hindi-eou-detector}},
note = {Fine-tuned model for Hindi end-of-utterance detection}
}
📄 License
This model is released under the MIT License. You are free to use, modify, and distribute it with attribution.
🙏 Acknowledgements
- Base model: distilbert-base-multilingual-cased
- Dataset: hindi-end-of-utterance-detection
- Created with the help of 🤗 Transformers