w2v-bert-2.0-chichewa_34_307h

This model is a fine-tuned version of facebook/w2v-bert-2.0 on the CLEAR-GLOBAL/CHICHEWA_34_307H dataset. It achieves the following results on the evaluation set (a sketch for recomputing these metrics follows the list):

  • Loss: 0.2792
  • Wer: 0.3856
  • Cer: 0.1100
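
WER (word error rate) and CER (character error rate) are edit-distance-based metrics normalized by reference length. As a minimal sketch (not from the original card), they can be recomputed with the Hugging Face `evaluate` library; the transcript lists below are placeholders.

```python
# Minimal sketch: recompute WER/CER with the `evaluate` library.
# The reference and prediction lists are placeholders for real transcripts.
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

references = ["moni dziko lapansi"]   # ground-truth transcripts (placeholder)
predictions = ["moni dziko lapansi"]  # model transcriptions (placeholder)

print("WER:", wer_metric.compute(references=references, predictions=predictions))
print("CER:", cer_metric.compute(references=references, predictions=predictions))
```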

Model description

More information needed

Intended uses & limitations

More information needed
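
As a rough illustration, the model can be used for Chichewa transcription through the standard `transformers` CTC interface. This is a hedged sketch, not the authors' inference script: the audio file name is a placeholder and 16 kHz mono input is assumed.

```python
# Usage sketch: transcribe a local 16 kHz audio file ("sample.wav" is a placeholder).
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2BertForCTC

model_id = "CLEAR-Global/w2v-bert-2.0-chichewa_34_307h"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2BertForCTC.from_pretrained(model_id)

# Load audio and resample to the 16 kHz rate expected by w2v-bert-2.0.
speech, _ = librosa.load("sample.wav", sr=16000)

inputs = processor(speech, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```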

Training and evaluation data

More information needed
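
The card names CLEAR-GLOBAL/CHICHEWA_34_307H as the training/evaluation corpus. Purely as a sketch, assuming the dataset is accessible on the Hub under that id (it may be gated or private), it could be inspected with the `datasets` library; the split and column names below are assumptions.

```python
# Sketch only: the dataset id, split names, and column names are assumptions.
from datasets import load_dataset

ds = load_dataset("CLEAR-Global/chichewa_34_307h")
print(ds)              # available splits and columns
print(ds["train"][0])  # first training example, if a "train" split exists
```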

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 3e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • optimizer: adamw_torch (AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • training_steps: 100000
  • mixed_precision_training: Native AMP
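
A minimal sketch, assuming the standard `transformers` Trainer API, of how the values above map onto TrainingArguments. The output directory is a placeholder, and this is not the exact training script used by the authors.

```python
# Sketch only: the listed hyperparameters expressed as transformers.TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./w2v-bert-2.0-chichewa",  # placeholder
    learning_rate=3e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=2,   # effective train batch size of 64
    seed=42,
    optim="adamw_torch",             # AdamW; betas=(0.9, 0.999), eps=1e-8 are the defaults
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    max_steps=100_000,
    fp16=True,                       # mixed precision ("Native AMP")
)
```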

Training results

| Training Loss | Epoch   | Step  | Validation Loss | Wer    | Cer    |
|:-------------:|:-------:|:-----:|:---------------:|:------:|:------:|
| 2.7235        | 0.2896  | 1000  | 2.9405          | 0.9854 | 0.8901 |
| 0.1802        | 0.5792  | 2000  | 0.9285          | 0.6857 | 0.2027 |
| 0.1404        | 0.8688  | 3000  | 0.6584          | 0.5723 | 0.1737 |
| 0.0446        | 1.1584  | 4000  | 0.5458          | 0.5495 | 0.1613 |
| 0.051         | 1.4480  | 5000  | 0.5079          | 0.5297 | 0.1528 |
| 0.0326        | 1.7376  | 6000  | 0.5507          | 0.5111 | 0.1529 |
| 0.033         | 2.0272  | 7000  | 0.4940          | 0.4774 | 0.1412 |
| 0.0341        | 2.3168  | 8000  | 0.4784          | 0.4954 | 0.1410 |
| 0.0308        | 2.6064  | 9000  | 0.4140          | 0.4981 | 0.1390 |
| 0.0216        | 2.8960  | 10000 | 0.3997          | 0.4689 | 0.1340 |
| 0.0262        | 3.1856  | 11000 | 0.3943          | 0.4716 | 0.1374 |
| 0.0216        | 3.4752  | 12000 | 0.3600          | 0.4463 | 0.1306 |
| 0.0137        | 3.7648  | 13000 | 0.3348          | 0.4286 | 0.1236 |
| 0.0154        | 4.0544  | 14000 | 0.3559          | 0.4290 | 0.1247 |
| 0.0147        | 4.3440  | 15000 | 0.3498          | 0.4234 | 0.1232 |
| 0.0334        | 4.6337  | 16000 | 0.3606          | 0.4261 | 0.1236 |
| 0.0097        | 4.9233  | 17000 | 0.3384          | 0.4054 | 0.1176 |
| 0.0099        | 5.2129  | 18000 | 0.3286          | 0.4323 | 0.1237 |
| 0.0167        | 5.5025  | 19000 | 0.3260          | 0.4192 | 0.1210 |
| 0.0097        | 5.7921  | 20000 | 0.3196          | 0.4198 | 0.1220 |
| 0.0101        | 6.0817  | 21000 | 0.3173          | 0.4121 | 0.1177 |
| 0.0152        | 6.3713  | 22000 | 0.3083          | 0.3943 | 0.1132 |
| 0.0116        | 6.6609  | 23000 | 0.3192          | 0.4119 | 0.1157 |
| 0.0165        | 6.9505  | 24000 | 0.3216          | 0.4117 | 0.1186 |
| 0.0071        | 7.2401  | 25000 | 0.3019          | 0.3828 | 0.1134 |
| 0.0125        | 7.5297  | 26000 | 0.3002          | 0.3975 | 0.1144 |
| 0.0056        | 7.8193  | 27000 | 0.3025          | 0.3924 | 0.1131 |
| 0.0137        | 8.1089  | 28000 | 0.2918          | 0.3876 | 0.1122 |
| 0.0062        | 8.3985  | 29000 | 0.2874          | 0.3845 | 0.1138 |
| 0.0066        | 8.6881  | 30000 | 0.2793          | 0.3847 | 0.1100 |
| 0.0181        | 8.9777  | 31000 | 0.2827          | 0.3642 | 0.1070 |
| 0.0045        | 9.2673  | 32000 | 0.2890          | 0.3878 | 0.1152 |
| 0.0043        | 9.5569  | 33000 | 0.3049          | 0.4021 | 0.1164 |
| 0.0113        | 9.8465  | 34000 | 0.2855          | 0.3759 | 0.1085 |
| 0.0119        | 10.1361 | 35000 | 0.2992          | 0.3782 | 0.1120 |

Framework versions

  • Transformers 4.48.1
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1