mms-1b-250_250h-hau-ft

This model is a fine-tuned version of facebook/mms-1b-all on the /MNT/MD0/SYNVOICES/DATA/HAUSA_250_250H dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3308
  • WER: 0.3513
  • CER: 0.0887
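
The card itself does not ship a usage snippet, so the following is a minimal transcription sketch with transformers, assuming 16 kHz mono input audio; `soundfile` and the `sample.wav` path are placeholder choices, not part of the original card:

```python
import torch
import soundfile as sf
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "CLEAR-Global/mms-1b-250_250h-hau-ft"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# MMS checkpoints expect 16 kHz mono audio; "sample.wav" is a placeholder.
audio, sr = sf.read("sample.wav")
inputs = processor(audio, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: take the most likely token at each frame.
pred_ids = torch.argmax(logits, dim=-1)[0]
print(processor.decode(pred_ids))
```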

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • total_eval_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 2.0
  • mixed_precision_training: Native AMP
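
Note that the total train batch size of 32 follows from 8 per device × 2 GPUs × 2 gradient-accumulation steps. As a rough guide only (the training script is not published with this card), these settings would map onto transformers TrainingArguments along these lines; `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters above as TrainingArguments.
training_args = TrainingArguments(
    output_dir="mms-1b-250_250h-hau-ft",  # placeholder path
    learning_rate=1e-3,
    per_device_train_batch_size=8,   # x 2 GPUs x 2 accumulation = 32 total
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=2.0,
    fp16=True,                       # Native AMP mixed precision
)
```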

Training results

| Training Loss | Epoch  | Step  | Validation Loss | WER    | CER    |
|:-------------:|:------:|:-----:|:---------------:|:------:|:------:|
| 0.3613        | 0.0320 | 500   | 0.4146          | 0.4071 | 0.1056 |
| 0.407         | 0.0641 | 1000  | 0.4004          | 0.4009 | 0.1034 |
| 0.342         | 0.0961 | 1500  | 0.3979          | 0.4061 | 0.1057 |
| 0.4212        | 0.1282 | 2000  | 0.3871          | 0.4145 | 0.1047 |
| 0.4009        | 0.1602 | 2500  | 0.3869          | 0.4030 | 0.1034 |
| 0.4283        | 0.1923 | 3000  | 0.3827          | 0.4032 | 0.1017 |
| 0.3129        | 0.2243 | 3500  | 0.3873          | 0.4055 | 0.1024 |
| 0.2488        | 0.2564 | 4000  | 0.3773          | 0.3883 | 0.0994 |
| 0.475         | 0.2884 | 4500  | 0.3820          | 0.3897 | 0.1000 |
| 0.4095        | 0.3205 | 5000  | 0.3688          | 0.3887 | 0.0990 |
| 0.3402        | 0.3525 | 5500  | 0.3722          | 0.3846 | 0.0982 |
| 0.2797        | 0.3846 | 6000  | 0.3696          | 0.3943 | 0.0987 |
| 0.4146        | 0.4166 | 6500  | 0.3725          | 0.3949 | 0.0993 |
| 0.3188        | 0.4487 | 7000  | 0.3661          | 0.3849 | 0.0987 |
| 0.2401        | 0.4807 | 7500  | 0.3653          | 0.3821 | 0.0977 |
| 0.2492        | 0.5128 | 8000  | 0.3594          | 0.3776 | 0.0964 |
| 0.3665        | 0.5448 | 8500  | 0.3685          | 0.3844 | 0.0982 |
| 0.4324        | 0.5769 | 9000  | 0.3651          | 0.3756 | 0.0957 |
| 0.2253        | 0.6089 | 9500  | 0.3625          | 0.3847 | 0.0970 |
| 0.2664        | 0.6410 | 10000 | 0.3580          | 0.3738 | 0.0947 |
| 0.4003        | 0.6730 | 10500 | 0.3556          | 0.3689 | 0.0936 |
| 0.3316        | 0.7051 | 11000 | 0.3570          | 0.3763 | 0.0952 |
| 0.2544        | 0.7371 | 11500 | 0.3531          | 0.3720 | 0.0941 |
| 0.2287        | 0.7692 | 12000 | 0.3579          | 0.3931 | 0.0972 |
| 0.3805        | 0.8012 | 12500 | 0.3558          | 0.3760 | 0.0957 |
| 0.3921        | 0.8333 | 13000 | 0.3533          | 0.3916 | 0.0975 |
| 0.3213        | 0.8653 | 13500 | 0.3541          | 0.3836 | 0.0957 |
| 0.4347        | 0.8973 | 14000 | 0.3528          | 0.3807 | 0.0956 |
| 0.3049        | 0.9294 | 14500 | 0.3548          | 0.3765 | 0.0951 |
| 0.3651        | 0.9614 | 15000 | 0.3506          | 0.3733 | 0.0942 |
| 0.2681        | 0.9935 | 15500 | 0.3512          | 0.3715 | 0.0936 |
| 0.2743        | 1.0255 | 16000 | 0.3489          | 0.3689 | 0.0932 |
| 0.3168        | 1.0576 | 16500 | 0.3495          | 0.3718 | 0.0937 |
| 0.2572        | 1.0896 | 17000 | 0.3476          | 0.3702 | 0.0931 |
| 0.3063        | 1.1217 | 17500 | 0.3398          | 0.3619 | 0.0916 |
| 0.2725        | 1.1537 | 18000 | 0.3471          | 0.3663 | 0.0923 |
| 0.3038        | 1.1858 | 18500 | 0.3448          | 0.3648 | 0.0920 |
| 0.1957        | 1.2178 | 19000 | 0.3453          | 0.3596 | 0.0906 |
| 0.2435        | 1.2498 | 19500 | 0.3429          | 0.3581 | 0.0908 |
| 0.2705        | 1.2819 | 20000 | 0.3400          | 0.3689 | 0.0921 |
| 0.2804        | 1.3139 | 20500 | 0.3414          | 0.3605 | 0.0915 |
| 0.2005        | 1.3460 | 21000 | 0.3417          | 0.3663 | 0.0915 |
| 0.3056        | 1.3780 | 21500 | 0.3427          | 0.3671 | 0.0917 |
| 0.2708        | 1.4101 | 22000 | 0.3393          | 0.3583 | 0.0909 |
| 0.2275        | 1.4421 | 22500 | 0.3396          | 0.3566 | 0.0902 |
| 0.3978        | 1.4742 | 23000 | 0.3383          | 0.3562 | 0.0902 |
| 0.39          | 1.5062 | 23500 | 0.3409          | 0.3548 | 0.0906 |
| 0.2801        | 1.5383 | 24000 | 0.3367          | 0.3604 | 0.0909 |
| 0.4354        | 1.5703 | 24500 | 0.3372          | 0.3565 | 0.0901 |
| 0.2681        | 1.6024 | 25000 | 0.3362          | 0.3549 | 0.0901 |
| 0.3148        | 1.6344 | 25500 | 0.3366          | 0.3542 | 0.0898 |
| 0.2573        | 1.6665 | 26000 | 0.3338          | 0.3518 | 0.0895 |
| 0.3137        | 1.6985 | 26500 | 0.3329          | 0.3551 | 0.0897 |
| 0.3578        | 1.7306 | 27000 | 0.3351          | 0.3510 | 0.0887 |
| 0.2697        | 1.7626 | 27500 | 0.3343          | 0.3549 | 0.0891 |
| 0.236         | 1.7947 | 28000 | 0.3330          | 0.3514 | 0.0887 |
| 0.1999        | 1.8267 | 28500 | 0.3326          | 0.3518 | 0.0890 |
| 0.3329        | 1.8588 | 29000 | 0.3305          | 0.3475 | 0.0883 |
| 0.2248        | 1.8908 | 29500 | 0.3310          | 0.3516 | 0.0889 |
| 0.2766        | 1.9229 | 30000 | 0.3308          | 0.3503 | 0.0886 |
| 0.4033        | 1.9549 | 30500 | 0.3311          | 0.3534 | 0.0890 |
| 0.3129        | 1.9870 | 31000 | 0.3308          | 0.3516 | 0.0887 |
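
The WER and CER columns are standard edit-distance metrics (word- and character-level, respectively). For reference, a minimal sketch of computing them with the `evaluate` library; the prediction and reference strings are placeholders, not data from this evaluation:

```python
import evaluate

# WER = word-level edit distance / reference word count;
# CER is the same computation at the character level.
wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

predictions = ["ina son ruwa"]         # placeholder model outputs
references = ["ina so in sha ruwa"]    # placeholder ground truth

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```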

Framework versions

  • Transformers 4.48.1
  • Pytorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0