tatoeba-ru-tok

This model is a fine-tuned version of Helsinki-NLP/opus-mt-ru-en on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5932
  • Bleu: 47.6666

Model description

More information needed

Intended uses & limitations

More information needed
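
Since usage details are not documented, below is a minimal inference sketch, assuming the standard Transformers seq2seq API inherited from the Helsinki-NLP/opus-mt-ru-en base model; the example input sentence is arbitrary.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "NetherQuartz/tatoeba-ru-tok"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Translate a Russian sentence (arbitrary example input)
inputs = tokenizer("Привет, мир!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```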

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows this list):

  • learning_rate: 2e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 15
  • mixed_precision_training: Native AMP
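
As a rough illustration, the hyperparameters above map onto Seq2SeqTrainingArguments as sketched below. This is a hypothetical reconstruction, not the original training script; output_dir and the evaluation/generation settings are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="tatoeba-ru-tok",   # placeholder, not taken from the original run
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=15,
    fp16=True,                     # native AMP mixed precision
    eval_strategy="epoch",         # assumed from the per-epoch results below
    predict_with_generate=True,    # assumed; needed to compute BLEU during evaluation
)
```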

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    |
|---------------|-------|-------|-----------------|---------|
| 1.0515        | 1.0   | 1167  | 0.8539          | 37.3803 |
| 0.8186        | 2.0   | 2334  | 0.7284          | 41.5032 |
| 0.7002        | 3.0   | 3501  | 0.6803          | 43.5555 |
| 0.6501        | 4.0   | 4668  | 0.6485          | 45.0023 |
| 0.6091        | 5.0   | 5835  | 0.6302          | 45.6329 |
| 0.5778        | 6.0   | 7002  | 0.6180          | 45.8879 |
| 0.553         | 7.0   | 8169  | 0.6109          | 46.6945 |
| 0.533         | 8.0   | 9336  | 0.6041          | 46.6169 |
| 0.5128        | 9.0   | 10503 | 0.6002          | 47.0549 |
| 0.5015        | 10.0  | 11670 | 0.5961          | 47.2017 |
| 0.4851        | 11.0  | 12837 | 0.5962          | 47.5851 |
| 0.4795        | 12.0  | 14004 | 0.5939          | 47.5400 |
| 0.4659        | 13.0  | 15171 | 0.5932          | 47.6666 |
| 0.4608        | 14.0  | 16338 | 0.5939          | 47.6703 |
| 0.4593        | 15.0  | 17505 | 0.5936          | 47.6572 |
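
The BLEU values above are typically produced by a compute_metrics hook passed to the Trainer. The snippet below is a generic sketch of that pattern using the evaluate library's sacrebleu metric; it is not taken from the original training script, and the function body is illustrative.

```python
import evaluate
import numpy as np
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NetherQuartz/tatoeba-ru-tok")
bleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    """Decode generated token ids and compute corpus BLEU with sacrebleu."""
    preds, labels = eval_preds
    # Labels are padded with -100 by the data collator; restore pad tokens before decoding
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = bleu.compute(predictions=decoded_preds,
                          references=[[ref] for ref in decoded_labels])
    return {"bleu": result["score"]}
```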

Framework versions

  • Transformers 4.52.4
  • Pytorch 2.7.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1