---
license: cc-by-4.0
tags:
- sentiment-classification
- telugu
- mbert
- multilingual
- baseline
language: te
datasets:
- DSL-13-SRMAP/TeSent_Benchmark-Dataset
model_name: mBERT_WR
---

# mBERT_WR: BERT-base Multilingual Telugu Sentiment Classification Model (With Rationale)

## Model Overview

**mBERT_WR** is a Telugu sentiment classification model based on **BERT-base-multilingual-cased (mBERT)**, Google's transformer model pretrained on Wikipedia text in 104 languages (including Telugu) with the Masked Language Modeling (MLM) and Next Sentence Prediction (NSP) objectives.
The "WR" in the model name stands for "**With Rationale**": the model is trained on both sentiment labels and **human-annotated rationales** from the TeSent_Benchmark-Dataset.
21
+
22
+ ---
23
+
24
+ ## Model Details
25
+
26
+ - **Architecture:** BERT-base Multilingual Cased (12 layers, ~100 million parameters)
27
+ - **Pretraining Data:** Wikipedia texts in 104 languages, including Telugu
28
+ - **Pretraining Objectives:** Masked Language Modeling (MLM) and Next Sentence Prediction (NSP)
29
+ - **Fine-tuning Data:** [TeSent_Benchmark-Dataset](https://huggingface.co/datasets/dsl-13-srmap/tesent_benchmark-dataset), using both sentence-level sentiment labels (positive, negative, neutral) and rationale annotations
30
+ - **Task:** Sentence-level sentiment classification (3-way)
31
+ - **Rationale Usage:** **Used** during training and/or inference ("WR" = With Rationale)
32
+
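
As a hedged sketch, the fine-tuned classifier can be loaded with the `transformers` library. Note that the checkpoint id and the label order below are illustrative assumptions, not details confirmed by this card.

```python
# Sketch of running the classifier with transformers; the repo id and
# label order are assumptions, not confirmed by the model card.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_ID = "Raj411/mBERT_WR"                   # assumed checkpoint id
LABELS = ["negative", "neutral", "positive"]   # assumed label order


def label_from_logits(logits: torch.Tensor, labels=LABELS) -> str:
    """Map a logit vector to its highest-scoring sentiment label."""
    return labels[int(logits.argmax())]


def predict_sentiment(text: str) -> str:
    """Tokenize a Telugu sentence and classify it as 3-way sentiment."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0]
    return label_from_logits(logits)
```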

---

## Intended Use

- **Primary Use:** Benchmarking Telugu sentiment classification on the TeSent_Benchmark-Dataset as a strong multilingual baseline, particularly for models trained with rationales
- **Research Setting:** mBERT is widely used in academic NLP research and is especially effective in low-resource settings and multilingual applications
39
+
40
+ ---
41
+
42
+ ## Why mBERT?
43
+
44
+ mBERT supports cross-lingual transfer with shared multilingual representations and acceptable performance on Telugu sentiment tasks, even with limited data.
45
+ It generalizes well across languages, making it effective for multilingual applications. However, it is not optimized for Telugu morphology or syntax and may lag behind regionally tuned models (IndicBERT, L3Cube-Telugu-BERT) in capturing fine-grained language nuances.
46
+
47
+ With rationale supervision, mBERT_WR can provide **explicit explanations** for its predictions.
48
+

---

## Performance and Limitations

**Strengths:**
- Powerful and reliable baseline for multilingual and cross-lingual NLP
- Good generalization to Telugu sentiment tasks
- Provides **explicit rationales** for predictions, aiding explainability
- Widely used and validated in academic research

**Limitations:**
- Not specifically tuned for Telugu; may miss fine-grained, language-specific nuances
- May be outperformed by Telugu-specialized models on highly nuanced or domain-specific tasks

---

## Training Data

- **Dataset:** [TeSent_Benchmark-Dataset](https://huggingface.co/datasets/dsl-13-srmap/tesent_benchmark-dataset)
- **Data Used:** The **Content** (Telugu sentence), **Label** (sentiment label), and **Rationale** (human-annotated rationale) columns are used to train mBERT_WR
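
Since the exact preprocessing pipeline is described in the paper, the following is only a minimal sketch of how a human-annotated rationale could be turned into per-token supervision; the function name and binary-mask format are illustrative assumptions, not the paper's confirmed encoding.

```python
# Illustrative sketch: convert an annotated rationale into a binary
# per-token mask (1 = token belongs to the rationale). The actual
# encoding used to train mBERT_WR may differ; see the paper.
from typing import List


def rationale_mask(tokens: List[str], rationale: List[str]) -> List[int]:
    """Mark each token of a sentence that appears in the rationale."""
    rationale_set = set(rationale)
    return [1 if tok in rationale_set else 0 for tok in tokens]
```

For example, `rationale_mask(["the", "movie", "was", "great"], ["great"])` yields `[0, 0, 0, 1]`, which can then be aligned with the model's token-level outputs during rationale-supervised training.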

---

## Language Coverage

- **Language:** Telugu (`te`)
- **Model Scope:** Evaluated on Telugu sentiment classification, with cross-lingual potential

---

## Citation and More Details

For the detailed experimental setup, evaluation metrics, and comparisons with rationale-based models, **please refer to our paper**.

---

## License

Released under [CC BY 4.0](LICENSE).