---
library_name: transformers
base_model: Benjaminpwh/xlsr-toratan-240-copt-base_K
tags:
  - generated_from_trainer
model-index:
  - name: xls-r-300m-toratan-120-copt
    results: []
---

xls-r-300m-toratan-120-copt

This model is a fine-tuned version of Benjaminpwh/xlsr-toratan-240-copt-base_K on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0124
  • Cer: 0.0045

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 30
  • mixed_precision_training: Native AMP
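The learning-rate schedule above (linear with 500 warmup steps) ramps from 0 to 3e-4, then decays linearly to 0 over the remaining steps. A minimal sketch of that schedule, mirroring the behavior of `get_linear_schedule_with_warmup` in transformers; the total of 2880 optimizer steps is inferred from the results table (400 steps per ~4.17 epochs over 30 epochs) and is an assumption:

```python
def linear_warmup_lr(step: int,
                     base_lr: float = 3e-4,
                     warmup_steps: int = 500,
                     total_steps: int = 2880) -> float:
    """Learning rate at a given optimizer step: linear warmup to
    base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)
```

For example, halfway through warmup (step 250) the rate is 1.5e-4, it peaks at 3e-4 at step 500, and reaches 0 at the final step.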

Training results

| Training Loss | Epoch   | Step | Validation Loss | Cer    |
|:-------------:|:-------:|:----:|:---------------:|:------:|
| 4.5582        | 4.1667  | 400  | 1.8811          | 0.5809 |
| 1.5024        | 8.3333  | 800  | 0.8199          | 0.2577 |
| 0.9223        | 12.5    | 1200 | 0.4515          | 0.1573 |
| 0.5883        | 16.6667 | 1600 | 0.2131          | 0.0854 |
| 0.369         | 20.8333 | 2000 | 0.0768          | 0.0305 |
| 0.2309        | 25.0    | 2400 | 0.0288          | 0.0117 |
| 0.1515        | 29.1667 | 2800 | 0.0124          | 0.0045 |
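The Cer column is the character error rate. The exact evaluation code is not included in this card, but CER is conventionally the character-level Levenshtein (edit) distance divided by the reference length; a minimal reference implementation under that assumption:

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: Levenshtein distance between the
    hypothesis and the reference, normalized by reference length."""
    h = hypothesis
    # prev[j] holds the edit distance between the first i-1 reference
    # characters and the first j hypothesis characters
    prev = list(range(len(h) + 1))
    for i, rc in enumerate(reference, 1):
        curr = [i] + [0] * len(h)
        for j, hc in enumerate(h, 1):
            curr[j] = min(prev[j] + 1,                 # deletion
                          curr[j - 1] + 1,             # insertion
                          prev[j - 1] + (rc != hc))    # substitution
        prev = curr
    return prev[len(h)] / len(reference)
```

For instance, one wrong character in a four-character reference gives a CER of 0.25; a final CER of 0.0045 corresponds to roughly one character error per 222 reference characters.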

Framework versions

  • Transformers 4.52.0.dev0
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1