Amazon-Beauty-Product-Reviews-distilBERT-base for Sentiment Analysis
Model Details
Model Description
This model is a fine-tuned version of distilbert-base-uncased, trained on a balanced subset of the Amazon beauty product reviews dataset.
It achieves the following results on the evaluation set:
- Loss: 0.5171
- Accuracy: 0.7862
- Precision: 0.7876
- Recall: 0.7860
- F1: 0.7867
Developer Information
- Developed by: Jiali Han
- Model Type: Text Classification
- Language(s): English
- License: Apache-2.0
- Parent Model: distilbert-base-uncased (see the DistilBERT model card for more details)
Uses
Direct Application
This model can be used for sentiment analysis on Amazon beauty product reviews.
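A minimal inference sketch using the transformers pipeline API. The label-to-sentiment mapping below is an assumption based on the three-class setup (0/1/2 in the classification report); verify it against the checkpoint's config.json (`id2label`) before relying on it:

```python
# Assumed mapping from raw pipeline labels to sentiment names; confirm
# against the model's id2label configuration before production use.
LABEL_MAP = {"LABEL_0": "negative", "LABEL_1": "neutral", "LABEL_2": "positive"}

def predict_sentiment(texts, model_id="jhan21/amazon-reviews-sentiment-distilbert-base-uncased"):
    """Classify review texts; downloads the checkpoint on first use."""
    from transformers import pipeline  # deferred import: requires `pip install transformers`
    clf = pipeline("text-classification", model=model_id)
    return [
        {"sentiment": LABEL_MAP.get(r["label"], r["label"]), "score": r["score"]}
        for r in clf(texts)
    ]
```

Example call: `predict_sentiment(["This serum cleared my skin in a week!"])` returns a list of `{"sentiment": ..., "score": ...}` dicts.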
Misuse and Out-of-scope Use
The model should not be used to intentionally create hostile or alienating environments for people. In addition, the model was not trained to produce factual or true representations of people or events, so using it to generate such content is out of scope.
Risks, Limitations and Biases
The model may produce biased predictions, particularly impacting underrepresented groups.
Users should evaluate the model’s risks for their specific use cases.
For further bias evaluation, consider testing the model on established bias benchmarks before deployment.
Training and Evaluation
Training Data
The model was trained on the Amazon beauty product reviews dataset, which was balanced to address class imbalance.
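The card does not state how the balancing or the rating-to-label mapping was done; one plausible sketch is to map star ratings to three classes and downsample each class to the size of the smallest. Both the 1–2/3/4–5 split and the downsampling strategy below are assumptions, not the author's confirmed procedure:

```python
import random
from collections import defaultdict

def rating_to_label(stars):
    # Assumed mapping: 1-2 stars -> 0 (negative), 3 -> 1 (neutral), 4-5 -> 2 (positive)
    if stars <= 2:
        return 0
    if stars == 3:
        return 1
    return 2

def balance_by_downsampling(reviews, seed=0):
    """Downsample each sentiment class to the size of the smallest class.

    `reviews` is an iterable of (text, star_rating) pairs.
    """
    rng = random.Random(seed)
    buckets = defaultdict(list)
    for text, stars in reviews:
        buckets[rating_to_label(stars)].append(text)
    n = min(len(texts) for texts in buckets.values())
    balanced = []
    for label, texts in buckets.items():
        for text in rng.sample(texts, n):
            balanced.append((text, label))
    return balanced
```

The near-equal supports in the classification report (11163 / 11099 / 11155) are consistent with this kind of per-class downsampling.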
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 0
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 1
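The hyperparameters above can be reconstructed as a transformers TrainingArguments object; this is a sketch from the listed values, and `output_dir` is a placeholder name, not the author's actual setup:

```python
from transformers import TrainingArguments

# Reconstructed from the hyperparameters listed above.
args = TrainingArguments(
    output_dir="distilbert-amazon-beauty-sentiment",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=0,
    lr_scheduler_type="linear",
    num_train_epochs=1,
)
```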
Training results
For detailed training logs, refer to the TensorBoard page.
Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 |
---|---|---|---|---|---|---|---|
0.7283 | 0.0299 | 500 | 0.6867 | 0.7073 | 0.7038 | 0.7071 | 0.7030 |
0.6718 | 0.0598 | 1000 | 0.6067 | 0.7340 | 0.7478 | 0.7340 | 0.7377 |
0.6473 | 0.0898 | 1500 | 0.6154 | 0.7390 | 0.7508 | 0.7390 | 0.7416 |
0.616 | 0.1197 | 2000 | 0.6448 | 0.7423 | 0.7373 | 0.7420 | 0.7377 |
0.6123 | 0.1496 | 2500 | 0.6286 | 0.7241 | 0.7677 | 0.7243 | 0.7284 |
0.5874 | 0.1795 | 3000 | 0.5774 | 0.7516 | 0.7539 | 0.7515 | 0.7523 |
0.5746 | 0.2095 | 3500 | 0.5708 | 0.7564 | 0.7636 | 0.7563 | 0.7582 |
0.5917 | 0.2394 | 4000 | 0.5839 | 0.7596 | 0.7602 | 0.7595 | 0.7598 |
0.5774 | 0.2693 | 4500 | 0.6225 | 0.7526 | 0.7482 | 0.7524 | 0.7492 |
0.594 | 0.2992 | 5000 | 0.5531 | 0.7662 | 0.7694 | 0.7661 | 0.7673 |
0.5591 | 0.3292 | 5500 | 0.5770 | 0.7665 | 0.7645 | 0.7663 | 0.7645 |
0.5548 | 0.3591 | 6000 | 0.5805 | 0.7613 | 0.7579 | 0.7611 | 0.7584 |
0.5742 | 0.3890 | 6500 | 0.5592 | 0.7639 | 0.7665 | 0.7638 | 0.7636 |
0.5374 | 0.4189 | 7000 | 0.5548 | 0.7712 | 0.7776 | 0.7711 | 0.7735 |
0.5488 | 0.4489 | 7500 | 0.5622 | 0.7747 | 0.7747 | 0.7745 | 0.7746 |
0.5557 | 0.4788 | 8000 | 0.5698 | 0.7642 | 0.7822 | 0.7643 | 0.7670 |
0.556 | 0.5087 | 8500 | 0.5380 | 0.7754 | 0.7777 | 0.7753 | 0.7764 |
0.5325 | 0.5386 | 9000 | 0.5791 | 0.7754 | 0.7746 | 0.7751 | 0.7736 |
0.5301 | 0.5686 | 9500 | 0.5569 | 0.7753 | 0.7738 | 0.7751 | 0.7744 |
0.5232 | 0.5985 | 10000 | 0.5391 | 0.7782 | 0.7806 | 0.7780 | 0.7789 |
0.5462 | 0.6284 | 10500 | 0.5499 | 0.7729 | 0.7698 | 0.7726 | 0.7683 |
0.5614 | 0.6583 | 11000 | 0.5243 | 0.7803 | 0.7818 | 0.7801 | 0.7808 |
0.5376 | 0.6883 | 11500 | 0.5406 | 0.7795 | 0.7772 | 0.7794 | 0.7780 |
0.5287 | 0.7182 | 12000 | 0.5227 | 0.7797 | 0.7852 | 0.7796 | 0.7806 |
0.5149 | 0.7481 | 12500 | 0.5423 | 0.7803 | 0.7788 | 0.7801 | 0.7792 |
0.5312 | 0.7780 | 13000 | 0.5338 | 0.7771 | 0.7860 | 0.7771 | 0.7781 |
0.5204 | 0.8079 | 13500 | 0.5183 | 0.7843 | 0.7857 | 0.7841 | 0.7849 |
0.5412 | 0.8379 | 14000 | 0.5192 | 0.7844 | 0.7893 | 0.7843 | 0.7860 |
0.515 | 0.8678 | 14500 | 0.5135 | 0.7845 | 0.7858 | 0.7843 | 0.7850 |
0.5033 | 0.8977 | 15000 | 0.5254 | 0.7862 | 0.7882 | 0.7860 | 0.7870 |
0.5023 | 0.9276 | 15500 | 0.5251 | 0.7863 | 0.7853 | 0.7861 | 0.7856 |
0.5042 | 0.9576 | 16000 | 0.5215 | 0.7865 | 0.7864 | 0.7864 | 0.7864 |
0.5237 | 0.9875 | 16500 | 0.5171 | 0.7862 | 0.7876 | 0.7860 | 0.7867 |
Evaluation Results
The fine-tuned DistilBERT model was evaluated on a dataset with the following splits:
- Training Samples: 133,665
- Validation Samples: 33,417
The evaluation was conducted on a three-class sentiment classification task. Below are the detailed results:
Classification Report
Label | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
0 | 0.78 | 0.78 | 0.78 | 11163 |
1 | 0.69 | 0.70 | 0.69 | 11099 |
2 | 0.89 | 0.87 | 0.88 | 11155 |
Accuracy | | | 0.78 | 33417 |
Macro Avg | 0.79 | 0.78 | 0.78 | 33417 |
Weighted Avg | 0.79 | 0.78 | 0.79 | 33417 |
Confusion Matrix
True label | Predicted 0 | Predicted 1 | Predicted 2 |
---|---|---|---|
0 | 8672 | 2331 | 160 |
1 | 2292 | 7793 | 1014 |
2 | 169 | 1237 | 9749 |
Framework versions
- Transformers 4.50.3
- Pytorch 2.6.0+cu124
- Tokenizers 0.21.1
Model tree for jhan21/amazon-reviews-sentiment-distilbert-base-uncased
Base model
distilbert/distilbert-base-uncased