mT5-Small (Maltese Sentiment Analysis)
This model is a fine-tuned version of google/mt5-small on the Maltese Sentiment Analysis dataset. It achieves the following results on the test set:
- Loss: 1.2452
- F1: 1.0
Intended uses & limitations
The model is fine-tuned for Maltese sentiment analysis and should be used on the same or a similar task. It inherits any limitations present in the base model.
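A minimal sketch of how the model might be loaded for inference with the Transformers library. The repository ID is the one this card belongs to; the helper name `predict_sentiment`, the `max_new_tokens` value, and the choice to pass raw text with no task prefix are assumptions, because the card does not document the input format or the label strings produced by the fine-tuning script.

```python
MODEL_ID = "MLRS/mt5-small_sentiment-mlt"


def predict_sentiment(text: str, max_new_tokens: int = 8) -> str:
    """Return the model's raw generated text for one input string.

    Hypothetical helper: the exact prompt format used during fine-tuning
    is not documented in this card, so the text is passed through as-is.
    """
    # Imported lazily so this module loads without the heavy dependencies.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # mT5 is an encoder-decoder model, so the label is generated as text.
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `predict_sentiment` with a Maltese sentence downloads the weights on first use and returns whatever string the decoder generates. Mapping that string to a sentiment class is left to the caller, since the label vocabulary is not stated here.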
Training procedure
The model was fine-tuned using a customised script.
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adafactor (no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 200.0
- early_stopping_patience: 20
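To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (none is listed above), and it takes 19 steps per epoch from the training results table, so the planned run is 200 × 19 = 3800 steps. The function name `linear_lr` is illustrative and is not taken from the training script.

```python
def linear_lr(step: int, base_lr: float = 1e-3,
              total_steps: int = 200 * 19, warmup_steps: int = 0) -> float:
    """Linear warmup followed by linear decay to zero at total_steps."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup steps (unused when warmup_steps == 0).
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly so the rate reaches 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(linear_lr(0))    # learning rate at the start of training
print(linear_lr(437))  # learning rate at the last logged step
```

Under these assumptions the rate at step 437 is still 88.5% of its initial value (3363 / 3800), so the learning rate had barely decayed by the time the run ended.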
Training results
Training Loss | Epoch | Step | Validation Loss | F1 |
---|---|---|---|---|
No log | 1.0 | 19 | 13.3064 | 0.5973 |
No log | 2.0 | 38 | 4.2820 | 0.2544 |
No log | 3.0 | 57 | 1.2517 | 1.0 |
No log | 4.0 | 76 | 0.3598 | 1.0 |
No log | 5.0 | 95 | 0.2682 | 1.0 |
No log | 6.0 | 114 | 0.3151 | 1.0 |
No log | 7.0 | 133 | 0.2757 | 1.0 |
No log | 8.0 | 152 | 0.2476 | 1.0 |
No log | 9.0 | 171 | 0.2500 | 1.0 |
No log | 10.0 | 190 | 0.4524 | 1.0 |
No log | 11.0 | 209 | 0.2950 | 1.0 |
No log | 12.0 | 228 | 0.2435 | 1.0 |
No log | 13.0 | 247 | 0.2639 | 1.0 |
No log | 14.0 | 266 | 0.2524 | 1.0 |
No log | 15.0 | 285 | 0.2416 | 1.0 |
No log | 16.0 | 304 | 0.2450 | 1.0 |
No log | 17.0 | 323 | 0.2824 | 1.0 |
No log | 18.0 | 342 | 0.4300 | 1.0 |
No log | 19.0 | 361 | 0.2379 | 1.0 |
No log | 20.0 | 380 | 0.2422 | 1.0 |
No log | 21.0 | 399 | 0.2543 | 1.0 |
No log | 22.0 | 418 | 0.4920 | 1.0 |
No log | 23.0 | 437 | 0.2496 | 1.0 |
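The table ends at epoch 23 even though `num_epochs` is 200, which is consistent with early stopping on validation F1 with a patience of 20. The sketch below replays the F1 column through a simple patience counter. Which metric the training script actually monitored is an assumption, and `early_stop_epoch` is a hypothetical helper rather than the script's own code.

```python
def early_stop_epoch(scores, patience):
    """Return the 1-based epoch at which training stops, or None if it never does.

    A score counts as an improvement only if it is strictly greater
    than the best score seen so far.
    """
    best = float("-inf")
    stale = 0  # consecutive evaluations without improvement
    for epoch, score in enumerate(scores, start=1):
        if score > best:
            best, stale = score, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None


# F1 column from the table above: epochs 1 and 2, then 1.0 for epochs 3 to 23.
f1_per_epoch = [0.5973, 0.2544] + [1.0] * 21
print(early_stop_epoch(f1_per_epoch, patience=20))  # 23
```

F1 first reaches 1.0 at epoch 3 and cannot improve further, so the 20th evaluation without improvement falls on epoch 23. Monitoring validation loss would not explain the stopping point, because the lowest loss (0.2379) occurs at epoch 19.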
Framework versions
- Transformers 4.49.0.dev0
- Pytorch 2.4.1+cu121
- Datasets 3.2.0
- Tokenizers 0.21.0
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Permissions beyond the scope of this license may be available at https://mlrs.research.um.edu.mt/.
Citation
This work was first presented in MELABenchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource Maltese NLP. Cite it as follows:
@inproceedings{micallef-borg-2025-melabenchv1,
title = "{MELAB}enchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource {M}altese {NLP}",
author = "Micallef, Kurt and
Borg, Claudia",
editor = "Che, Wanxiang and
Nabende, Joyce and
Shutova, Ekaterina and
Pilehvar, Mohammad Taher",
booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.findings-acl.1053/",
doi = "10.18653/v1/2025.findings-acl.1053",
pages = "20505--20527",
ISBN = "979-8-89176-256-5",
}
Evaluation results
- Macro-averaged F1 on Maltese Sentiment Analysis (MELABench Leaderboard): 100.000