nllb-200-1.3B-ft-eng-to-cym

This model is a fine-tuned version of facebook/nllb-200-1.3B for English-to-Welsh (eng_Latn → cym_Latn) translation; the fine-tuning dataset is not documented. It achieves the following results on the evaluation set:

  • Loss: 0.5683
  • Bleu: 39.1969
  • Gen Len: 35.2025
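
For reference, Bleu and Gen Len scores like these are conventionally produced by a compute_metrics hook passed to the Trainer. The sketch below assumes the standard sacrebleu metric from the evaluate library, as in the stock Transformers translation examples; it is not confirmed to be the exact setup used for this model.

```python
import evaluate
import numpy as np
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("DewiBrynJones/nllb-200-1.3B-ft-eng-to-cym")
bleu = evaluate.load("sacrebleu")  # assumed metric, per the usual translation recipe

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    # Restore pad tokens where -100 was used to mask the loss before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    # Gen Len: mean number of non-pad tokens in the generated outputs.
    gen_len = np.mean(
        [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
    )
    return {"bleu": result["score"], "gen_len": gen_len}
```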

Model description

More information needed

Intended uses & limitations

More information needed
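
The model name indicates English-to-Welsh machine translation as the intended use. A minimal inference sketch, assuming the repository ships the standard NLLB tokenizer (eng_Latn and cym_Latn are the NLLB-200 codes for English and Welsh):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "DewiBrynJones/nllb-200-1.3B-ft-eng-to-cym"
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("The weather is fine today.", return_tensors="pt")
# NLLB models select the output language via the first decoder token,
# so generation must be forced to start with the Welsh language token.
outputs = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("cym_Latn"),
    max_length=128,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```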

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 6000
  • training_steps: 30000
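
Expressed as Transformers Seq2SeqTrainingArguments, this configuration would look roughly like the sketch below. The output_dir and the 2000-step evaluation cadence (read off the results table that follows) are assumptions, not documented values.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="nllb-200-1.3B-ft-eng-to-cym",  # assumed
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=6000,
    max_steps=30000,
    eval_strategy="steps",
    eval_steps=2000,               # inferred from the results table below
    predict_with_generate=True,    # needed to compute Bleu / Gen Len
)
```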

Training results

Training Loss   Epoch    Step    Validation Loss   Bleu      Gen Len
1.129           0.0455   2000    0.9534            28.7045   42.0879
0.989           0.0910   4000    0.8191            28.0358   46.0137
0.9079          0.1365   6000    0.7438            29.7605   49.7891
0.834           0.1820   8000    0.6941            31.4068   46.6953
0.7823          0.2275   10000   0.6595            31.7358   39.6693
0.756           0.2730   12000   0.6388            35.5019   39.7181
0.7265          0.3185   14000   0.6221            34.0568   41.3639
0.7173          0.3640   16000   0.6071            40.6291   38.7305
0.7075          0.4094   18000   0.5959            41.9787   37.3835
0.7038          0.4549   20000   0.5881            37.0750   40.9609
0.6903          0.5004   22000   0.5817            38.2801   37.5365
0.6741          0.5459   24000   0.5764            37.3169   39.1055
0.6797          0.5914   26000   0.5712            38.9570   36.1172
0.6761          0.6369   28000   0.5690            38.7328   35.4466
0.6665          0.6824   30000   0.5683            39.1969   35.2025

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0