---
license: cc-by-4.0
tags:
- sentiment-classification
- telugu
- mbert
- multilingual
- baseline
language: te
datasets:
- DSL-13-SRMAP/TeSent_Benchmark-Dataset
model_name: mBERT_WR
---

# mBERT_WR: BERT-base Multilingual Telugu Sentiment Classification Model (With Rationale)

## Model Overview

**mBERT_WR** is a Telugu sentiment classification model based on **BERT-base-multilingual-cased (mBERT)**, Google's transformer model pretrained on Wikipedia text in 104 languages (including Telugu) with the Masked Language Modeling (MLM) and Next Sentence Prediction (NSP) objectives.

"WR" in the model name stands for "**With Rationale**": the model is fine-tuned on both sentiment labels and **human-annotated rationales** from the TeSent_Benchmark-Dataset.

---

## Model Details

- **Architecture:** BERT-base Multilingual Cased (12 layers, ~100 million parameters)
- **Pretraining Data:** Wikipedia texts in 104 languages, including Telugu
- **Pretraining Objectives:** Masked Language Modeling (MLM) and Next Sentence Prediction (NSP)
- **Fine-tuning Data:** [TeSent_Benchmark-Dataset](https://huggingface.co/datasets/dsl-13-srmap/tesent_benchmark-dataset), using both sentence-level sentiment labels (positive, negative, neutral) and rationale annotations
- **Task:** Sentence-level sentiment classification (3-way)
- **Rationale Usage:** **Used** during training and/or inference ("WR" = With Rationale)

---
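If the fine-tuned weights are published on the Hub, inference follows the standard `transformers` sequence-classification pattern. A minimal sketch; the repo id `DSL-13-SRMAP/mBERT_WR`, the label order in `ID2LABEL`, and the example sentence are illustrative assumptions, not details confirmed by this card:

```python
ID2LABEL = {0: "positive", 1: "negative", 2: "neutral"}  # assumed label order

def argmax(scores):
    # Index of the largest logit, i.e. the predicted class id.
    return max(range(len(scores)), key=lambda i: scores[i])

def predict_sentiment(text, repo_id="DSL-13-SRMAP/mBERT_WR"):
    # Requires `transformers` and `torch`, and downloads the (assumed)
    # checkpoint from the Hub; check the model page for the real repo id.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForSequenceClassification.from_pretrained(repo_id)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return ID2LABEL.get(argmax(logits), argmax(logits))

if __name__ == "__main__":
    # A Telugu sentence ("this movie is very good") as a sample input.
    print(predict_sentiment("ఈ సినిమా చాలా బాగుంది"))
```

If the checkpoint ships its own `id2label` mapping in its config, prefer `model.config.id2label` over the hard-coded dictionary above.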

## Intended Use

- **Primary Use:** Benchmarking Telugu sentiment classification on the TeSent_Benchmark-Dataset as a strong multilingual baseline, particularly for comparison against other rationale-trained models
- **Research Setting:** A widely used baseline in academic NLP research, especially in low-resource and multilingual settings

---

## Why mBERT?

mBERT supports cross-lingual transfer through shared multilingual representations and delivers acceptable performance on Telugu sentiment tasks even with limited data.
It generalizes well across languages, which makes it effective for multilingual applications. However, it is not optimized for Telugu morphology or syntax, and it may lag behind regionally tuned models (e.g., IndicBERT, L3Cube-Telugu-BERT) in capturing fine-grained language nuances.

With rationale supervision, mBERT_WR can additionally provide **explicit explanations** for its predictions.

---
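The card does not specify how rationale supervision enters the training objective. One common recipe, shown here purely as an assumption and not as the authors' confirmed method, is a joint loss: cross-entropy on the sentiment label plus a token-level binary cross-entropy that pushes predicted token-importance scores toward the 0/1 human rationale mask, weighted by a coefficient `lam`:

```python
import math

def cross_entropy(probs, target_idx):
    # Negative log-likelihood of the gold class.
    return -math.log(probs[target_idx])

def rationale_loss(token_scores, rationale_mask):
    # Binary cross-entropy between predicted token-importance scores
    # (each in (0, 1)) and the 0/1 human rationale mask.
    eps = 1e-9
    losses = [
        -(m * math.log(s + eps) + (1 - m) * math.log(1 - s + eps))
        for s, m in zip(token_scores, rationale_mask)
    ]
    return sum(losses) / len(losses)

def joint_loss(class_probs, gold_label, token_scores, rationale_mask, lam=0.5):
    # Hypothetical "With Rationale" objective: label loss + lam * rationale loss.
    return cross_entropy(class_probs, gold_label) + lam * rationale_loss(
        token_scores, rationale_mask
    )

# Toy example: 3-way sentiment over a 4-token sentence.
loss = joint_loss(
    class_probs=[0.7, 0.2, 0.1],        # model's softmax over {pos, neg, neu}
    gold_label=0,
    token_scores=[0.9, 0.1, 0.8, 0.2],  # predicted token importance
    rationale_mask=[1, 0, 1, 0],        # human-annotated rationale tokens
)
```

The weight `lam` trades off label accuracy against rationale agreement; its value here is arbitrary.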

## Performance and Limitations

**Strengths:**
- Powerful and reliable baseline for multilingual and cross-lingual NLP
- Good generalization to Telugu sentiment tasks
- Provides **explicit rationales** for predictions, aiding explainability
- Widely used and validated in academic research

**Limitations:**
- Not specifically tuned for Telugu; may miss fine-grained, language-specific nuances
- May be outperformed by Telugu-specialized models for highly nuanced or domain-specific tasks

---

## Training Data

- **Dataset:** [TeSent_Benchmark-Dataset](https://huggingface.co/datasets/dsl-13-srmap/tesent_benchmark-dataset)
- **Data Used:** The **Content** (Telugu sentence), **Label** (sentiment label), and **Rationale** (human-annotated rationale) columns are used for mBERT_WR training

---
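The dataset can be pulled with the `datasets` library, and rationale annotations can be aligned to whitespace tokens of each sentence. This is a sketch under assumptions: that the `Rationale` column stores the rationale words as plain text, and the English stand-in strings below replace real Telugu examples:

```python
def rationale_mask(content, rationale):
    # 1 for whitespace tokens of `content` that also appear in the rationale,
    # 0 otherwise. Assumes the Rationale column stores rationale words as text.
    rationale_words = set(rationale.split())
    return [1 if tok in rationale_words else 0 for tok in content.split()]

def load_tesent():
    # Requires the `datasets` library and network access to the Hub.
    from datasets import load_dataset
    return load_dataset("DSL-13-SRMAP/TeSent_Benchmark-Dataset")

# Toy illustration with English stand-ins for the Telugu Content/Rationale text:
mask = rationale_mask("the movie was really great", "really great")
# -> [0, 0, 0, 1, 1]
```

For subword models like mBERT, this word-level mask would still need to be expanded to WordPiece tokens before training.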

## Language Coverage

- **Language:** Telugu (`te`)
- **Model Scope:** Evaluated for Telugu sentiment classification, with cross-lingual potential

---

## Citation and More Details

For detailed experimental setup, evaluation metrics, and comparisons with rationale-based models, **please refer to our paper**.

---

## License

Released under [CC BY 4.0](LICENSE).