metadata

library_name: transformers
language:
  - mt
license: cc-by-nc-sa-4.0
base_model: google/mt5-small
model-index:
  - name: mt5-small_sentiment-mlt
    results:
      - task:
          type: sentiment-analysis
          name: Sentiment Analysis
        dataset:
          type: sentiment_mlt
          name: Maltese Sentiment Analysis
        metrics:
          - type: f1
            args: macro
            value: 100
            name: Macro-averaged F1
        source:
          name: MELABench Leaderboard
          url: https://huggingface.co/spaces/MLRS/MELABench
extra_gated_fields:
  Name: text
  Surname: text
  Date of Birth: date_picker
  Organisation: text
  Country: country
  I agree to use this model in accordance with the license and for non-commercial use ONLY: checkbox

mT5-Small (Maltese Sentiment Analysis)

This model is a fine-tuned version of google/mt5-small on the Maltese Sentiment Analysis dataset. It achieves the following results on the test set:

Loss: 1.2452
F1: 1.0
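
For readers unfamiliar with the metric, macro-averaged F1 is the unweighted mean of the per-class F1 scores, so every class counts equally regardless of how many examples it has. The sketch below is a minimal pure-Python illustration, not the evaluation code used for this model; the label names and toy predictions are hypothetical. A score of 1.0 on this 0 to 1 scale corresponds to the value of 100 reported in the metadata.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels, for illustration only
gold = ["positive", "negative", "negative", "positive"]
perfect = macro_f1(gold, gold)                # every prediction correct
degenerate = macro_f1(gold, ["negative"] * 4)  # always predicts one class

print(perfect, round(degenerate, 4))
```

The second call shows why macro averaging is strict: a model that always predicts one class gets an F1 of 0 for the other class, which drags the average down.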

Intended uses & limitations

The model is fine-tuned for Maltese sentiment analysis and should be used for the same or a closely similar task. Any limitations present in the base model are inherited.

Training procedure

The model was fine-tuned using a customised script.

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: Adafactor (no additional optimizer arguments)
lr_scheduler_type: linear
num_epochs: 200.0
early_stopping_patience: 20
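
To give a feel for what the linear schedule means in practice, the sketch below computes the learning rate at a given optimizer step. It rests on assumptions that are not stated in this card: no warmup steps, decay to zero, and a schedule length of 200 epochs times 19 steps per epoch (the step count per epoch is taken from the training results). The function `linear_lr` is a hypothetical helper, not part of the training script.

```python
def linear_lr(step, base_lr=0.001, total_steps=200 * 19, warmup_steps=0):
    """Learning rate under a linear warmup/decay schedule (assumed shape)."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


lr_start = linear_lr(0)
lr_at_step_437 = linear_lr(437)  # last step listed in the training results

print(lr_start, lr_at_step_437)
```

Under these assumptions the learning rate has only decayed by about 11.5% by step 437, because the schedule is sized for the full 200 epochs.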

Training results

Training Loss	Epoch	Step	Validation Loss	F1
No log	1.0	19	13.3064	0.5973
No log	2.0	38	4.2820	0.2544
No log	3.0	57	1.2517	1.0
No log	4.0	76	0.3598	1.0
No log	5.0	95	0.2682	1.0
No log	6.0	114	0.3151	1.0
No log	7.0	133	0.2757	1.0
No log	8.0	152	0.2476	1.0
No log	9.0	171	0.2500	1.0
No log	10.0	190	0.4524	1.0
No log	11.0	209	0.2950	1.0
No log	12.0	228	0.2435	1.0
No log	13.0	247	0.2639	1.0
No log	14.0	266	0.2524	1.0
No log	15.0	285	0.2416	1.0
No log	16.0	304	0.2450	1.0
No log	17.0	323	0.2824	1.0
No log	18.0	342	0.4300	1.0
No log	19.0	361	0.2379	1.0
No log	20.0	380	0.2422	1.0
No log	21.0	399	0.2543	1.0
No log	22.0	418	0.4920	1.0
No log	23.0	437	0.2496	1.0
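
The log ends at epoch 23 of a possible 200, which is consistent with early stopping on validation F1 with a patience of 20: F1 first reaches 1.0 at epoch 3 and cannot improve afterwards. The monitored metric is not stated in this card, so this is an inference. The sketch below replays the F1 column through a simple patience counter; `early_stop_epoch` is a hypothetical helper that mimics the usual "stop after N evaluations without improvement" rule, not the actual training code.

```python
def early_stop_epoch(scores, patience):
    """Return the 1-based epoch at which training stops, or None if it never does."""
    best = float("-inf")
    stale = 0  # evaluations since the last strict improvement
    for epoch, score in enumerate(scores, start=1):
        if score > best:
            best = score
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None


# Validation F1 per epoch, copied from the table above
f1_per_epoch = [0.5973, 0.2544] + [1.0] * 21
stop_epoch = early_stop_epoch(f1_per_epoch, patience=20)

print(stop_epoch)
```

Had validation loss been the monitored quantity instead, its best value (0.2379 at epoch 19) would have reset the counter and training would have continued past epoch 23.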

Framework versions

Transformers 4.49.0.dev0
PyTorch 2.4.1+cu121
Datasets 3.2.0
Tokenizers 0.21.0

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Permissions beyond the scope of this license may be available at https://mlrs.research.um.edu.mt/.

Citation

This work was first presented in MELABenchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource Maltese NLP. Cite it as follows:

@inproceedings{micallef-borg-2025-melabenchv1,
    title = "{MELAB}enchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource {M}altese {NLP}",
    author = "Micallef, Kurt  and
      Borg, Claudia",
    editor = "Che, Wanxiang  and
      Nabende, Joyce  and
      Shutova, Ekaterina  and
      Pilehvar, Mohammad Taher",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.findings-acl.1053/",
    doi = "10.18653/v1/2025.findings-acl.1053",
    pages = "20505--20527",
    ISBN = "979-8-89176-256-5",
}