---
library_name: transformers
tags:
- text-classification
- emotion-detection
- sentiment-analysis
- distilbert
language:
- en
license: apache-2.0
base_model: distilbert-base-uncased
pipeline_tag: text-classification
metrics:
- accuracy
- f1
---

# DistilBERT Emotion Classifier

## Model Description

This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) for multi-class emotion classification. It classifies English text into six emotional categories (sadness, joy, love, anger, fear, and surprise), enabling applications in sentiment analysis, customer feedback analysis, and social media monitoring.

**Developed by:** Sathwik3

**Model type:** Text Classification (Emotion Detection)

**Language(s):** English

**License:** Apache 2.0

**Base model:** [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased)

## Model Details

### Architecture

The model is based on DistilBERT, a distilled version of BERT that retains 97% of BERT's language understanding while being 40% smaller and 60% faster. The architecture consists of:

- 6 transformer layers
- 768 hidden dimensions
- 12 attention heads
- ~66M parameters
- Classification head for emotion prediction
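
These figures can be checked directly against the base checkpoint's configuration; a minimal sketch using the Transformers `AutoConfig` API (attribute names follow `DistilBertConfig`):

```python
from transformers import AutoConfig

# Load the configuration of the base checkpoint and inspect its architecture
config = AutoConfig.from_pretrained("distilbert-base-uncased")
print(config.n_layers)    # 6 transformer layers
print(config.dim)         # 768 hidden dimensions
print(config.n_heads)     # 12 attention heads
print(config.hidden_dim)  # 3072 feed-forward (intermediate) size
print(config.vocab_size)  # 30522 vocabulary entries
```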

### Training Objective

The model was fine-tuned using cross-entropy loss for multi-class classification, optimizing for accurate emotion categorization across multiple emotional states.
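
As a rough illustration of this objective (not the author's training script), the standard sequence-classification head in Transformers computes the cross-entropy loss automatically when labels are passed:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Illustrative only: a fresh 6-way classification head on the base checkpoint
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=6
)

batch = tokenizer(["I feel great today"], return_tensors="pt")
labels = torch.tensor([1])  # hypothetical gold label index

# Passing labels makes the model return the cross-entropy loss alongside the logits
outputs = model(**batch, labels=labels)
print(outputs.loss, outputs.logits.shape)
```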

## Intended Uses

### Direct Use

The model can be directly used for:

- **Emotion detection** in text documents
- **Sentiment analysis** of customer reviews and feedback
- **Social media monitoring** to understand emotional tone
- **Content moderation** based on emotional content
- **Mental health applications** for emotion tracking in journals
- **Chatbot enhancement** for emotion-aware responses

### Downstream Use

This model can be integrated into larger systems for:

- Customer service platforms for automated response routing
- Market research tools for analyzing consumer sentiment
- Educational platforms for emotional intelligence training
- Healthcare applications for mental wellness monitoring

### Out-of-Scope Use

The model should **not** be used for:

- Clinical diagnosis or medical decision-making
- Making critical decisions about individuals without human oversight
- Applications where misclassification could cause harm
- Languages other than English (without additional fine-tuning)
- Real-time crisis intervention or emergency response

## Limitations and Bias

### Limitations

- **Language limitation:** The model is trained primarily on English text and may not perform well on other languages or code-switched text
- **Context sensitivity:** Short texts or texts lacking context may be misclassified
- **Domain specificity:** Performance may vary across different domains (e.g., formal vs. informal text)
- **Sarcasm and irony:** The model may struggle with non-literal expressions
- **Cultural nuances:** Emotion expression varies across cultures, which may affect performance

### Bias Considerations

- The model's predictions may reflect biases present in the training data
- Emotion categories may not universally apply across all cultures and contexts
- Performance may vary across demographic groups depending on training data representation
- Users should validate model outputs, especially in sensitive applications

### Recommendations

- Always review model predictions in high-stakes applications
- Use the model as a decision support tool, not a sole decision-maker
- Evaluate performance on your specific use case before deployment
- Monitor for bias and fairness issues in production
- Provide clear communication to end users about the model's capabilities and limitations

## How to Get Started with the Model

Use the code below to get started with the model:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "Sathwik3/distilbert-emotion-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example text
text = "I am so happy and excited about this amazing opportunity!"

# Tokenize and predict
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)

predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
predicted_class = torch.argmax(predictions, dim=-1).item()

print(f"Predicted emotion class: {predicted_class}")
# id2label maps the class index back to its emotion name, if set in the model config
print(f"Predicted emotion label: {model.config.id2label[predicted_class]}")
print(f"Confidence scores: {predictions}")
```

For pipeline usage:

```python
from transformers import pipeline

# Create emotion classification pipeline
emotion_classifier = pipeline("text-classification", model="Sathwik3/distilbert-emotion-classifier")

# Classify emotion
result = emotion_classifier("I am so happy and excited about this amazing opportunity!")
print(result)
```
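
If you need the scores for every emotion rather than only the top label, recent versions of the pipeline accept a `top_k` argument (a hedged sketch; older Transformers releases use `return_all_scores=True` instead):

```python
from transformers import pipeline

emotion_classifier = pipeline(
    "text-classification", model="Sathwik3/distilbert-emotion-classifier"
)

# top_k=None asks the pipeline to return a score for every label
all_scores = emotion_classifier(
    "I am so happy and excited about this amazing opportunity!", top_k=None
)
print(all_scores)
```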

## Training Details

### Training Data

The model was fine-tuned on an emotion classification dataset. Specific dataset details:

- **Dataset:** Emotion dataset
- **Size:** 16,000 examples
- **Emotion categories:** sadness, joy, love, anger, fear, surprise
- **Data split:** train / validation / test
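
The label set and size match the publicly available `dair-ai/emotion` dataset on the Hugging Face Hub; assuming that dataset (an assumption, since the card does not name it explicitly), it can be loaded as follows:

```python
from datasets import load_dataset

# Assumption: the training data corresponds to the dair-ai/emotion dataset
dataset = load_dataset("dair-ai/emotion")
print(dataset)  # train / validation / test splits
print(dataset["train"].features["label"].names)
# ['sadness', 'joy', 'love', 'anger', 'fear', 'surprise']
```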

### Training Procedure

#### Preprocessing

- Text tokenization using the DistilBERT tokenizer
- Maximum sequence length: 512 tokens
- Truncation and padding applied as needed
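
A minimal sketch of the preprocessing described above (the `text` column name, the dataset identifier, and the use of `datasets.map` are assumptions, not the author's exact script):

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    # Truncation and padding up to the model's 512-token limit, as described above
    return tokenizer(batch["text"], truncation=True, padding=True, max_length=512)

dataset = load_dataset("dair-ai/emotion")  # assumed dataset, see Training Data above
tokenized = dataset.map(tokenize, batched=True)
```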

#### Training Hyperparameters

- **Training regime:** Mixed precision (fp16)
- **Optimizer:** AdamW
- **Learning rate:** 2e-5
- **Batch size:** 64
- **Number of epochs:** 2
- **Weight decay:** 0.01
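
For reference, the hyperparameters above map onto `TrainingArguments` roughly as follows (a sketch, not the author's exact configuration; `output_dir` is illustrative, and AdamW is the Trainer's default optimizer):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilbert-emotion-classifier",  # illustrative path
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    num_train_epochs=2,
    weight_decay=0.01,
    fp16=True,  # mixed-precision training
)
```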

## Evaluation

### Testing Data & Metrics

#### Testing Data

- **Test set:** [Description of test data - placeholder]
- **Test set size:** [Number of examples - placeholder]
- **Distribution:** [Class distribution information - placeholder]

#### Metrics

The model's performance is evaluated using:

- **Accuracy:** Overall classification accuracy
- **F1 Score:** Macro and weighted F1 scores for balanced evaluation
- **Precision:** Per-class and average precision
- **Recall:** Per-class and average recall
- **Confusion Matrix:** For detailed error analysis
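
One way to compute the headline metrics above during evaluation with the Trainer API (a sketch assuming scikit-learn is available; not necessarily how the reported numbers were produced):

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    # eval_pred is the (logits, labels) pair passed by the Trainer
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1_weighted": f1_score(labels, preds, average="weighted"),
        "f1_macro": f1_score(labels, preds, average="macro"),
    }
```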

### Results

#### Overall Performance

| Metric      | Value  |
|-------------|--------|
| Accuracy    | 0.9295 |
| Weighted F1 | 0.9292 |

## Technical Specifications

### Model Architecture

- **Base Model:** DistilBERT (distilbert-base-uncased)
- **Model Size:** ~66M parameters (base) + classification head
- **Layers:** 6 transformer layers
- **Hidden Size:** 768
- **Attention Heads:** 12
- **Intermediate Size:** 3072
- **Max Sequence Length:** 512 tokens
- **Vocabulary Size:** 30,522 tokens

### Software

- **Framework:** PyTorch
- **Library:** Hugging Face Transformers
- **Python Version:** 3.10
- **Key Dependencies:**
  - transformers
  - torch
  - tokenizers

## Citation

If you use this model in your research or applications, please cite:

**BibTeX:**

```bibtex
@misc{sathwik3-distilbert-emotion,
  author       = {Sathwik3},
  title        = {DistilBERT Emotion Classifier},
  year         = {2024},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Sathwik3/distilbert-emotion-classifier}}
}
```

Please also cite the original DistilBERT paper:

```bibtex
@article{sanh2019distilbert,
  title   = {DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},
  author  = {Sanh, Victor and Debut, Lysandre and Chaumond, Julien and Wolf, Thomas},
  journal = {arXiv preprint arXiv:1910.01108},
  year    = {2019}
}
```

**APA:**

Sathwik3. (2024). *DistilBERT Emotion Classifier*. Hugging Face. https://huggingface.co/Sathwik3/distilbert-emotion-classifier

## Model Card Authors

Sathwik3

## Model Card Contact

For questions or feedback about this model, please open an issue in the model's repository or contact via Hugging Face.

---

*This model card follows the guidelines from [Mitchell et al. (2019)](https://arxiv.org/abs/1810.03993) and the Hugging Face Model Card template.*