Results

This model is a fine-tuned version of distilbert-base-uncased on the dair-ai/emotion dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1699
  • Accuracy: 0.941

Model Description

This model is a fine-tuned version of DistilBERT-base-uncased, tailored for emotion recognition in text. It classifies input text into one of six emotion categories: sadness, joy, love, anger, fear, and surprise. The fine-tuning was performed on the dair-ai/emotion dataset, which includes 20,000 labeled text-emotion pairs. DistilBERT, being a smaller and faster variant of BERT, ensures this model is efficient while delivering robust performance for emotion classification tasks.
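A minimal inference sketch using the Transformers `pipeline` API; the model id is taken from this page, and the example input and the score shown in the comment are illustrative:

```python
from transformers import pipeline

# Load the fine-tuned emotion classifier from the Hugging Face Hub.
classifier = pipeline(
    "text-classification",
    model="YonasMersha/fine-tuned-distilbert-emotion",
)

# Example input and score are illustrative; label names come from the model config.
print(classifier("I can't wait to see you again, this made my whole week!"))
# e.g. [{'label': 'joy', 'score': 0.99}]
```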

Intended Uses & Limitations

Intended Uses

  • Emotion Classification: Classify text into one of six emotions: sadness, joy, love, anger, fear, or surprise.
  • Sentiment Analysis: Infer sentiment (e.g., joy as positive, anger as negative) based on predicted emotions, though the model was not explicitly trained for this purpose (a simple mapping sketch follows this list).
  • Chatbots and Virtual Assistants: Enhance conversational AI by detecting user emotions for empathetic responses.
  • Content Moderation: Identify content with strong emotions like anger or fear for moderation purposes.
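For the sentiment-analysis use case above, one option is to collapse the six emotions into coarse polarities. The mapping below is an illustrative assumption, not something the model was trained with:

```python
# Illustrative emotion-to-polarity mapping (an assumption, not part of the model).
EMOTION_TO_SENTIMENT = {
    "sadness": "negative",
    "joy": "positive",
    "love": "positive",
    "anger": "negative",
    "fear": "negative",
    "surprise": "neutral",
}

def emotion_to_sentiment(emotion_label: str) -> str:
    """Map a predicted emotion label to a coarse sentiment polarity."""
    return EMOTION_TO_SENTIMENT.get(emotion_label, "unknown")
```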

Limitations

  • Emotion Granularity: Restricted to six emotions, potentially missing nuanced or complex emotional states.
  • Contextual Understanding: May struggle with sarcasm, irony, or emotions requiring deeper context.
  • Language: Trained on English text only, with limited performance on other languages.
  • Dataset Bias: Performance may reflect biases in the training data, such as underrepresentation of certain emotional expressions.
  • Short Texts: Suboptimal performance on very short inputs (e.g., single words) due to limited context.

Training and Evaluation Data

The model was fine-tuned on the dair-ai/emotion dataset, comprising 20,000 English text samples labeled with one of six emotions:

  • 0: sadness
  • 1: joy
  • 2: love
  • 3: anger
  • 4: fear
  • 5: surprise

The dataset is divided as follows:

  • Training Set: 16,000 samples
  • Validation Set: 2,000 samples
  • Test Set: 2,000 samples

The dataset is balanced across the six emotion classes, promoting effective learning for each category.
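The splits and label mapping described above can be inspected directly with the `datasets` library:

```python
from datasets import load_dataset

# Load the dair-ai/emotion dataset (16,000/2,000/2,000 train/validation/test split).
dataset = load_dataset("dair-ai/emotion")

print(dataset)                                   # shows the three splits and their sizes
print(dataset["train"].features["label"].names)  # ['sadness', 'joy', 'love', 'anger', 'fear', 'surprise']
```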

Training Procedure

Preprocessing

  • Tokenization: Text was tokenized using the DistilBERT tokenizer, with a maximum sequence length of 512 tokens. Padding and truncation ensured uniform input sizes.
  • Data Formatting: Converted to PyTorch tensors for training compatibility.
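A sketch of this preprocessing, assuming the standard `datasets`/`transformers` workflow; column names and options beyond those listed above are assumptions, not taken from the original training script:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    # Pad/truncate every example to the maximum sequence length of 512 tokens.
    return tokenizer(batch["text"], padding="max_length", truncation=True, max_length=512)

dataset = load_dataset("dair-ai/emotion")
tokenized = dataset.map(tokenize, batched=True)

# Format the tokenized columns as PyTorch tensors for training.
tokenized.set_format("torch", columns=["input_ids", "attention_mask", "label"])
```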

Demo

Try the model in action here.

Training Hyperparameters

Fine-tuning was conducted using the Hugging Face Trainer API with:

  • Epochs: 3
  • Batch Size: 16 (training), 64 (evaluation)
  • Learning Rate: 2e-5
  • Optimizer: AdamW
  • Weight Decay: 0.01
  • Warmup Steps: 500

Training Process

  • Loss Function: Cross-entropy loss for multi-class classification.
  • Evaluation Metric: Accuracy on the validation set.
  • Training Duration: 3 epochs, with logging every 10 steps.
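Putting the hyperparameters and training process above together, a configuration sketch along the following lines would reproduce the setup; anything not listed in this card (output directory, evaluation schedule, metric implementation) is an assumption:

```python
import numpy as np
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Tokenize the dataset as described in the Preprocessing section.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
dataset = load_dataset("dair-ai/emotion").map(
    lambda batch: tokenizer(
        batch["text"], padding="max_length", truncation=True, max_length=512
    ),
    batched=True,
)

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=6
)

def compute_metrics(eval_pred):
    # Accuracy on the validation set, as described above.
    logits, labels = eval_pred
    return {"accuracy": float((np.argmax(logits, axis=-1) == labels).mean())}

# Hyperparameters as listed in this section; output_dir and eval_strategy are assumptions.
args = TrainingArguments(
    output_dir="fine-tuned-distilbert-emotion",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    learning_rate=2e-5,
    weight_decay=0.01,
    warmup_steps=500,
    logging_steps=10,
    eval_strategy="epoch",
)

# Cross-entropy loss is the Trainer default for multi-class sequence classification.
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    compute_metrics=compute_metrics,
)
trainer.train()
```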

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 3

Training results

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.1
  • Tokenizers 0.21.1