metadata
license: apache-2.0
base_model: google/mt5-small
tags:
  - generated_from_trainer
model-index:
  - name: text-translit-detector-ru
    results: []

text-translit-detector-ru

This model is a fine-tuned version of google/mt5-small on an unspecified dataset (the training dataset was not recorded). It achieves the following results on the evaluation set:

  • Loss: 0.0632
  • Mean Distance: 0
  • Max Distance: 1
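The card does not define Mean Distance or Max Distance. One plausible reading, and it is only an assumption, is a character-level edit (Levenshtein) distance between each predicted string and its reference, aggregated over the evaluation set. The sketch below shows that interpretation; the function names `levenshtein` and `distance_summary` are hypothetical and are not taken from the training code.

```python
from typing import Sequence, Tuple


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insert, delete, substitute all cost 1)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def distance_summary(predictions: Sequence[str],
                     references: Sequence[str]) -> Tuple[float, int]:
    """Return (mean distance, max distance) over paired strings.

    ASSUMPTION: this is a guess at what the card's metrics measure.
    """
    distances = [levenshtein(p, r) for p, r in zip(predictions, references)]
    return sum(distances) / len(distances), max(distances)


# Made-up strings for illustration only, not model output.
mean_d, max_d = distance_summary(["abc", "abd"], ["abc", "abc"])
print(mean_d, max_d)
```

Under this reading, a reported mean of 0 alongside a maximum of 1 would suggest that the mean was rounded or truncated to an integer, since at least one evaluation example has a nonzero distance.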

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 15
  • eval_batch_size: 15
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 40
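To make the schedule concrete, here is a minimal pure-Python sketch of a linear warmup followed by linear decay, using the values listed above. The total of 106,560 optimizer steps is taken from the last row of the training results; the warmup length is derived here as 10% of that total (10,656 steps, which coincides with the end of epoch 4). The function name `linear_lr` is hypothetical, and the exact rounding of the warmup boundary in the actual training run may differ by a step.

```python
PEAK_LR = 1e-4          # learning_rate
TOTAL_STEPS = 106_560   # final step in the training results (40 epochs)
WARMUP_RATIO = 0.1      # lr_scheduler_warmup_ratio

# Derived value; rounding here is an assumption of this sketch.
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        # Ramp linearly from 0 up to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Decay linearly from the peak down to 0 at the final step.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


for step in (0, WARMUP_STEPS // 2, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, linear_lr(step))
```

The learning rate therefore peaks at 0.0001 around step 10,656 and falls to zero by the end of epoch 40.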

Training results

| Training Loss | Epoch | Step | Validation Loss | Mean Distance | Max Distance |
|---|---|---|---|---|---|
| 0.8541 | 1.0 | 2664 | 0.3404 | 0 | 1 |
| 0.0451 | 2.0 | 5328 | 0.0605 | 0 | 1 |
| 0.0112 | 3.0 | 7992 | 0.0411 | 0 | 1 |
| 0.0068 | 4.0 | 10656 | 0.0205 | 0 | 1 |
| 0.007 | 5.0 | 13320 | 0.0242 | 0 | 1 |
| 0.0022 | 6.0 | 15984 | 0.0272 | 0 | 1 |
| 0.0054 | 7.0 | 18648 | 0.0080 | 0 | 1 |
| 0.0036 | 8.0 | 21312 | 0.0252 | 0 | 1 |
| 0.0039 | 9.0 | 23976 | 0.0210 | 0 | 1 |
| 0.0026 | 10.0 | 26640 | 0.0170 | 0 | 1 |
| 0.0026 | 11.0 | 29304 | 0.0043 | 0 | 1 |
| 0.0029 | 12.0 | 31968 | 0.0135 | 0 | 1 |
| 0.0011 | 13.0 | 34632 | 0.0313 | 0 | 1 |
| 0.0017 | 14.0 | 37296 | 0.0353 | 0 | 1 |
| 0.0014 | 15.0 | 39960 | 0.0117 | 0 | 1 |
| 0.0014 | 16.0 | 42624 | 0.0140 | 0 | 1 |
| 0.0013 | 17.0 | 45288 | 0.0220 | 0 | 1 |
| 0.0009 | 18.0 | 47952 | 0.0247 | 0 | 1 |
| 0.0017 | 19.0 | 50616 | 0.0322 | 0 | 1 |
| 0.0022 | 20.0 | 53280 | 0.0314 | 0 | 1 |
| 0.0006 | 21.0 | 55944 | 0.0305 | 0 | 1 |
| 0.001 | 22.0 | 58608 | 0.0292 | 0 | 1 |
| 0.0008 | 23.0 | 61272 | 0.0373 | 0 | 1 |
| 0.0008 | 24.0 | 63936 | 0.0309 | 0 | 1 |
| 0.0008 | 25.0 | 66600 | 0.0385 | 0 | 1 |
| 0.0014 | 26.0 | 69264 | 0.0134 | 0 | 1 |
| 0.0004 | 27.0 | 71928 | 0.0239 | 0 | 1 |
| 0.0011 | 28.0 | 74592 | 0.0164 | 0 | 1 |
| 0.0002 | 29.0 | 77256 | 0.0186 | 0 | 1 |
| 0.0001 | 30.0 | 79920 | 0.0298 | 0 | 1 |
| 0.0008 | 31.0 | 82584 | 0.0277 | 0 | 1 |
| 0.0003 | 32.0 | 85248 | 0.0377 | 0 | 1 |
| 0.0003 | 33.0 | 87912 | 0.0354 | 0 | 1 |
| 0.0007 | 34.0 | 90576 | 0.0585 | 0 | 1 |
| 0.0005 | 35.0 | 93240 | 0.0568 | 0 | 1 |
| 0.0001 | 36.0 | 95904 | 0.0567 | 0 | 1 |
| 0.0009 | 37.0 | 98568 | 0.0605 | 0 | 1 |
| 0.0002 | 38.0 | 101232 | 0.0613 | 0 | 1 |
| 0.0002 | 39.0 | 103896 | 0.0563 | 0 | 1 |
| 0.0002 | 40.0 | 106560 | 0.0632 | 0 | 1 |
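A short sketch for reading the results above: it finds the epoch with the lowest validation loss and derives a rough upper bound on the training set size from the step counts. The validation losses are copied from the table. The size estimate assumes a single device and no gradient accumulation, neither of which the card states, so treat it as an approximation.

```python
# Validation loss per epoch (epochs 1 to 40), copied from the results table.
val_loss = [
    0.3404, 0.0605, 0.0411, 0.0205, 0.0242, 0.0272, 0.0080, 0.0252,
    0.0210, 0.0170, 0.0043, 0.0135, 0.0313, 0.0353, 0.0117, 0.0140,
    0.0220, 0.0247, 0.0322, 0.0314, 0.0305, 0.0292, 0.0373, 0.0309,
    0.0385, 0.0134, 0.0239, 0.0164, 0.0186, 0.0298, 0.0277, 0.0377,
    0.0354, 0.0585, 0.0568, 0.0567, 0.0605, 0.0613, 0.0563, 0.0632,
]

STEPS_PER_EPOCH = 2664   # step count at epoch 1.0
TRAIN_BATCH_SIZE = 15    # train_batch_size

# Epochs are 1-indexed in the table, list positions are 0-indexed.
best_epoch = min(range(1, len(val_loss) + 1), key=lambda e: val_loss[e - 1])
best_loss = val_loss[best_epoch - 1]
final_loss = val_loss[-1]

# ASSUMPTION: one device, no gradient accumulation; the last batch of an
# epoch may be partial, so this is an upper bound on the example count.
max_train_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

print(f"best epoch: {best_epoch}, validation loss: {best_loss}")    # epoch 11, 0.0043
print(f"final epoch: {len(val_loss)}, validation loss: {final_loss}")  # epoch 40, 0.0632
print(f"training examples (upper bound): {max_train_examples}")     # 39960
```

The lowest validation loss (0.0043) occurs at epoch 11, while the loss reported at the top of the card (0.0632) matches the final epoch. Training loss keeps falling after epoch 11 while validation loss drifts upward, which is consistent with overfitting in the later epochs; the card does not say whether an earlier checkpoint was kept.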

Framework versions

  • Transformers 4.35.0
  • PyTorch 2.1.0+cu118
  • Datasets 2.14.6
  • Tokenizers 0.14.1