mt-en-vi

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt on the IWSLT 2015 English-Vietnamese dataset (iwslt2015). It achieves the following results on the evaluation set:

  • Loss: 1.2564
  • BLEU: 34.9585
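
The card does not include a usage snippet; below is a minimal inference sketch using the standard mBART-50 API from transformers. The en_XX/vi_VN language codes follow the base model's convention, and the input sentence is purely illustrative.

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

# Load the fine-tuned checkpoint by its hub id (taken from this card)
model = MBartForConditionalGeneration.from_pretrained("hai2131/mt-en-vi")
tokenizer = MBart50TokenizerFast.from_pretrained("hai2131/mt-en-vi")

# mBART-50 is multilingual: set the source language and force the target
# language token at the start of generation (en_XX -> vi_VN)
tokenizer.src_lang = "en_XX"
inputs = tokenizer("The weather is nice today.", return_tensors="pt")
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["vi_VN"],
    max_length=128,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```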

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 2
  • mixed_precision_training: Native AMP
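
These are standard Trainer hyperparameters; a minimal Seq2SeqTrainingArguments sketch that reproduces them is below. The output_dir, the 500-step eval cadence (inferred from the results table), and predict_with_generate are assumptions not stated in the card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt-en-vi",          # assumed name, not from the card
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=2,
    fp16=True,                      # "Native AMP" mixed precision
    eval_strategy="steps",          # inferred: the table logs eval every 500 steps
    eval_steps=500,
    predict_with_generate=True,     # assumed, needed to compute BLEU during eval
)
```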

Training results

| Training Loss | Epoch  | Step | Validation Loss | BLEU    |
|:-------------:|:------:|:----:|:---------------:|:-------:|
| 1.1887        | 0.2399 | 500  | 1.2961          | 34.2118 |
| 1.1422        | 0.4798 | 1000 | 1.2807          | 34.6929 |
| 1.127         | 0.7198 | 1500 | 1.2689          | 34.7625 |
| 1.1215        | 0.9597 | 2000 | 1.2577          | 34.5726 |
| 1.031         | 1.1996 | 2500 | 1.2728          | 35.0683 |
| 1.0128        | 1.4395 | 3000 | 1.2632          | 34.8100 |
| 1.0184        | 1.6795 | 3500 | 1.2565          | 34.7721 |
| 1.0131        | 1.9194 | 4000 | 1.2564          | 34.9585 |
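
The BLEU column is presumably computed with sacrebleu, as in the standard transformers translation fine-tuning setup. A minimal scoring sketch, assuming the evaluate library (not listed in the framework versions) and placeholder prediction/reference strings:

```python
import evaluate

# Placeholder pairs; sacrebleu expects one list of reference strings
# per prediction
predictions = ["Hôm nay trời đẹp."]
references = [["Hôm nay trời đẹp."]]

bleu = evaluate.load("sacrebleu")
result = bleu.compute(predictions=predictions, references=references)
print(result["score"])  # corpus-level BLEU, e.g. 34.9585 on this eval set
```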

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 2.20.0
  • Tokenizers 0.21.1