---
library_name: transformers
base_model: facebook/mbart-large-50-many-to-many-mmt
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: mt-en-vi
    results: []
datasets:
  - thainq107/iwslt2015-en-vi
---

# mt-en-vi

This model is a fine-tuned version of [facebook/mbart-large-50-many-to-many-mmt](https://huggingface.co/facebook/mbart-large-50-many-to-many-mmt) on the [thainq107/iwslt2015-en-vi](https://huggingface.co/datasets/thainq107/iwslt2015-en-vi) English-Vietnamese dataset (IWSLT 2015). It achieves the following results on the evaluation set:

- Loss: 1.2564
- BLEU: 34.9585

## Model description

mBART-50 many-to-many is a multilingual sequence-to-sequence transformer; this checkpoint fine-tunes it for English-to-Vietnamese machine translation on the IWSLT 2015 en-vi data.

## Intended uses & limitations

The model is intended for translating English text into Vietnamese. It has only been evaluated on the IWSLT 2015 en-vi evaluation data, so translation quality on other domains is untested.
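
The snippet below is a minimal inference sketch using the standard mBART-50 API in `transformers`, not an official usage example: the hub id `hai2131/mt-en-vi` is assumed from the model name, and the input sentence is illustrative.

```python
# Minimal inference sketch. The hub id "hai2131/mt-en-vi" is assumed from the
# model name; point it at your own copy of the checkpoint if it lives elsewhere.
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

model_id = "hai2131/mt-en-vi"  # assumed repo id
tokenizer = MBart50TokenizerFast.from_pretrained(model_id)
model = MBartForConditionalGeneration.from_pretrained(model_id)

tokenizer.src_lang = "en_XX"  # mBART-50 language code for English
inputs = tokenizer("The weather is nice today.", return_tensors="pt")
outputs = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["vi_VN"],  # force Vietnamese output
    max_length=64,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```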

## Training and evaluation data

The model was trained and evaluated on [thainq107/iwslt2015-en-vi](https://huggingface.co/datasets/thainq107/iwslt2015-en-vi), an English-Vietnamese parallel corpus from IWSLT 2015.
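
To inspect the data yourself, a minimal sketch with the `datasets` library (the split and column names are whatever the dataset repo defines, so print them rather than assuming):

```python
# Sketch: load the dataset named in the card metadata and inspect its layout.
from datasets import load_dataset

ds = load_dataset("thainq107/iwslt2015-en-vi")
print(ds)  # shows available splits, columns, and row counts
```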

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (the sketch after this list shows how they map onto `Seq2SeqTrainingArguments`):

- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 2
- mixed_precision_training: Native AMP
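
As a sketch of how these settings translate into `transformers`' `Seq2SeqTrainingArguments`: the `output_dir` and the steps-based evaluation cadence are assumptions (the results table below reports validation metrics every 500 steps); the remaining values mirror the list above.

```python
# Hedged sketch mapping the listed hyperparameters onto Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt-en-vi",            # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",              # AdamW, torch implementation
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=2,
    fp16=True,                        # "Native AMP" mixed precision
    eval_strategy="steps",            # assumed from the 500-step eval cadence
    eval_steps=500,
    predict_with_generate=True,       # needed to compute BLEU during evaluation
)
```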

### Training results

| Training Loss | Epoch  | Step | Validation Loss | BLEU    |
|:-------------:|:------:|:----:|:---------------:|:-------:|
| 1.1887        | 0.2399 | 500  | 1.2961          | 34.2118 |
| 1.1422        | 0.4798 | 1000 | 1.2807          | 34.6929 |
| 1.127         | 0.7198 | 1500 | 1.2689          | 34.7625 |
| 1.1215        | 0.9597 | 2000 | 1.2577          | 34.5726 |
| 1.031         | 1.1996 | 2500 | 1.2728          | 35.0683 |
| 1.0128        | 1.4395 | 3000 | 1.2632          | 34.8100 |
| 1.0184        | 1.6795 | 3500 | 1.2565          | 34.7721 |
| 1.0131        | 1.9194 | 4000 | 1.2564          | 34.9585 |
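
The card does not say which BLEU implementation produced these scores; one common way to compute a column like this is the `evaluate` library's sacreBLEU wrapper, sketched below with illustrative strings.

```python
# Sketch: corpus BLEU via the evaluate library's sacreBLEU wrapper.
# The prediction/reference strings are illustrative, not from the eval set.
import evaluate

metric = evaluate.load("sacrebleu")
predictions = ["Hôm nay trời đẹp."]             # model outputs
references = [["Hôm nay thời tiết rất đẹp."]]   # one list of gold references per output
print(metric.compute(predictions=predictions, references=references)["score"])
```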

### Framework versions

- Transformers 4.49.0
- PyTorch 2.6.0+cu124
- Datasets 2.20.0
- Tokenizers 0.21.1