iva_mt_wslot-m2m100_418M-en-es-massive_unfiltered

This model is a fine-tuned version of facebook/m2m100_418M, trained for English-to-Spanish (en-es) translation on the iva_mt_wslot-exp dataset. It achieves the following results on the evaluation set; a usage sketch follows the metrics.

  • Loss: 0.0114
  • Bleu: 67.6426
  • Gen Len: 18.9134
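
The model can be loaded with the standard M2M100 classes from transformers. The sketch below is a minimal usage example; the Hub path is assumed from the model name above (the namespace is omitted and must be filled in), and the input sentence is only an illustration.

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

# Hub path assumed from the model name above; prepend the actual namespace.
model_id = "iva_mt_wslot-m2m100_418M-en-es-massive_unfiltered"

tokenizer = M2M100Tokenizer.from_pretrained(model_id)
model = M2M100ForConditionalGeneration.from_pretrained(model_id)

# M2M100 is multilingual: set the source language on the tokenizer and force
# the target-language token as the first generated token.
tokenizer.src_lang = "en"
inputs = tokenizer("Set an alarm for seven in the morning", return_tensors="pt")
outputs = model.generate(**inputs, forced_bos_token_id=tokenizer.get_lang_id("es"))
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```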

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 7
  • mixed_precision_training: Native AMP
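
Expressed as transformers Trainer arguments, the list above corresponds roughly to the following sketch. The output_dir and the per-epoch evaluation strategy are assumptions (the card does not state them); the Adam betas and epsilon match the optimizer defaults.

```python
from transformers import Seq2SeqTrainingArguments

# A sketch of the hyperparameters above as Trainer arguments. output_dir and
# evaluation_strategy are assumptions; the rest mirrors the list.
training_args = Seq2SeqTrainingArguments(
    output_dir="m2m100_418M-en-es",   # hypothetical output path
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=7,
    fp16=True,                        # Native AMP mixed precision
    evaluation_strategy="epoch",      # assumed from the per-epoch results table
    predict_with_generate=True,       # required for Bleu / Gen Len metrics
)
```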

Training results

Training Loss   Epoch   Step    Validation Loss   Bleu      Gen Len
0.0129          1.0     2879    0.0118            65.4383   18.8697
0.009           2.0     5758    0.0109            66.6878   18.9331
0.0066          3.0     8637    0.0107            66.6143   18.8687
0.0049          4.0     11516   0.0108            66.9832   18.8067
0.0037          5.0     14395   0.0109            67.452    18.8598
0.0028          6.0     17274   0.0112            67.4281   18.9213
0.0023          7.0     20153   0.0114            67.6426   18.9134
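
Bleu and Gen Len in this table are the usual sacrebleu score and mean generated-sequence length reported by seq2seq evaluation scripts. A minimal sketch of how such numbers are computed with the evaluate library follows; the prediction and reference lists are placeholders, not data from this run.

```python
import evaluate
import numpy as np

# Placeholder predictions/references; in practice these come from running
# model.generate() over the evaluation set.
predictions = ["Pon una alarma a las siete de la mañana"]
references = [["Pon una alarma a las siete de la mañana"]]

sacrebleu = evaluate.load("sacrebleu")
bleu = sacrebleu.compute(predictions=predictions, references=references)["score"]

# "Gen Len" is the average length of the generated outputs; whitespace tokens
# are used here as a rough proxy for tokenizer tokens.
gen_len = np.mean([len(p.split()) for p in predictions])
print(f"Bleu: {bleu:.4f}, Gen Len: {gen_len:.4f}")
```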

Framework versions

  • Transformers 4.28.1
  • Pytorch 2.0.0+cu118
  • Datasets 2.11.0
  • Tokenizers 0.13.3