---
license: cc-by-4.0
tags:
- sentiment-classification
- telugu
- mbert
- multilingual
- baseline
language: te
datasets:
- DSL-13-SRMAP/TeSent_Benchmark-Dataset
model_name: mBERT_WR
---

# mBERT_WR: BERT-base Multilingual Telugu Sentiment Classification Model (With Rationale)

## Model Overview

**mBERT_WR** is a Telugu sentiment classification model based on **BERT-base-multilingual-cased (mBERT)**, Google's transformer model pretrained on Wikipedia text in 104 languages (including Telugu) with the Masked Language Modeling (MLM) and Next Sentence Prediction (NSP) objectives.
The "WR" in the model name stands for "**With Rationale**": the model is trained on both sentiment labels and **human-annotated rationales** from the TeSent_Benchmark-Dataset.
21
+
22
+ ---
23
+
24
+ ## Model Details
25
+
26
+ - **Architecture:** BERT-base Multilingual Cased (12 layers, ~100 million parameters)
27
+ - **Pretraining Data:** Wikipedia texts in 104 languages, including Telugu
28
+ - **Pretraining Objectives:** Masked Language Modeling (MLM) and Next Sentence Prediction (NSP)
29
+ - **Fine-tuning Data:** [TeSent_Benchmark-Dataset](https://huggingface.co/datasets/dsl-13-srmap/tesent_benchmark-dataset), using both sentence-level sentiment labels (positive, negative, neutral) and rationale annotations
30
+ - **Task:** Sentence-level sentiment classification (3-way)
31
+ - **Rationale Usage:** **Used** during training and/or inference ("WR" = With Rationale)
32
+
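
As a hedged sketch, the fine-tuned classifier can be loaded with the `transformers` library. Note that the checkpoint id and the label order below are illustrative assumptions, not details confirmed by this card.

```python
# Sketch of running the classifier with transformers; the repo id and
# label order are assumptions, not confirmed by the model card.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_ID = "Raj411/mBERT_WR"                   # assumed checkpoint id
LABELS = ["negative", "neutral", "positive"]   # assumed label order


def label_from_logits(logits: torch.Tensor, labels=LABELS) -> str:
    """Map a logit vector to its highest-scoring sentiment label."""
    return labels[int(logits.argmax())]


def predict_sentiment(text: str) -> str:
    """Tokenize a Telugu sentence and classify it as 3-way sentiment."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0]
    return label_from_logits(logits)
```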

---

## Intended Use

- **Primary Use:** Benchmarking Telugu sentiment classification on the TeSent_Benchmark-Dataset as a strong multilingual baseline, particularly for models trained with rationales
- **Research Setting:** mBERT is widely used in academic NLP research and is especially effective in low-resource settings and multilingual applications
39
+
40
+ ---
41
+
42
+ ## Why mBERT?
43
+
44
+ mBERT supports cross-lingual transfer with shared multilingual representations and acceptable performance on Telugu sentiment tasks, even with limited data.
45
+ It generalizes well across languages, making it effective for multilingual applications. However, it is not optimized for Telugu morphology or syntax and may lag behind regionally tuned models (IndicBERT, L3Cube-Telugu-BERT) in capturing fine-grained language nuances.
46
+
47
+ With rationale supervision, mBERT_WR can provide **explicit explanations** for its predictions.
48
+

---

## Performance and Limitations

**Strengths:**
- Powerful and reliable baseline for multilingual and cross-lingual NLP
- Good generalization to Telugu sentiment tasks
- Provides **explicit rationales** for predictions, aiding explainability
- Widely used and validated in academic research

**Limitations:**
- Not specifically tuned for Telugu; may miss fine-grained, language-specific nuances
- May be outperformed by Telugu-specialized models on highly nuanced or domain-specific tasks

---

## Training Data

- **Dataset:** [TeSent_Benchmark-Dataset](https://huggingface.co/datasets/dsl-13-srmap/tesent_benchmark-dataset)
- **Data Used:** The **Content** (Telugu sentence), **Label** (sentiment label), and **Rationale** (human-annotated rationale) columns are used to train mBERT_WR
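
Since the exact preprocessing pipeline is described in the paper, the following is only a minimal sketch of how a human-annotated rationale could be turned into per-token supervision; the function name and binary-mask format are illustrative assumptions, not the paper's confirmed encoding.

```python
# Illustrative sketch: convert an annotated rationale into a binary
# per-token mask (1 = token belongs to the rationale). The actual
# encoding used to train mBERT_WR may differ; see the paper.
from typing import List


def rationale_mask(tokens: List[str], rationale: List[str]) -> List[int]:
    """Mark each token of a sentence that appears in the rationale."""
    rationale_set = set(rationale)
    return [1 if tok in rationale_set else 0 for tok in tokens]
```

For example, `rationale_mask(["the", "movie", "was", "great"], ["great"])` yields `[0, 0, 0, 1]`, which can then be aligned with the model's token-level outputs during rationale-supervised training.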

---

## Language Coverage

- **Language:** Telugu (`te`)
- **Model Scope:** Evaluated on Telugu sentiment classification, with cross-lingual potential

---

## Citation and More Details

For the detailed experimental setup, evaluation metrics, and comparisons with rationale-based models, **please refer to our paper**.

---

## License

Released under [CC BY 4.0](LICENSE).