mms-1b-100_400h-hau-ft

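This model is a fine-tuned version of facebook/mms-1b-all for Hausa speech recognition. The training dataset is recorded only as /MNT/MD0/SYNVOICES/DATA/HAUSA_100_400H - NA, which appears to be a local filesystem path rather than a public dataset name. The model achieves the following results on the evaluation set: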

Loss: 0.3479
WER (word error rate): 0.3637
CER (character error rate): 0.0925
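
For readers unfamiliar with the two metrics, the sketch below shows how WER and CER are conventionally defined: the edit distance between reference and hypothesis, divided by the reference length in words or characters. It is a minimal illustration, not the evaluation code used for this model. The function names and the placeholder strings are assumptions made for the example, and the card does not say whether text was normalized before scoring. Evaluation libraries usually pool errors over the whole test set rather than averaging per utterance.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # substitution (or match when cost is 0)
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate for a single utterance."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate for a single utterance (spaces count as characters)."""
    return edit_distance(reference, hypothesis) / len(reference)


# Placeholder tokens, not real Hausa text.
example_wer = wer("a b c d", "a x c")  # 1 substitution + 1 deletion over 4 words
example_cer = cer("a b c d", "a x c")  # 3 character edits over 7 characters
```

Read this way, a WER of 0.3637 corresponds to roughly 36 word-level errors per 100 reference words, and a CER of 0.0925 to roughly 9 character-level errors per 100 reference characters.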

Model description

More information needed

Intended uses & limitations

More information needed
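
The card does not document usage, so the following is only a sketch of how a checkpoint like this is typically loaded, assuming it works with the standard Transformers automatic-speech-recognition pipeline in the same way as other MMS fine-tunes. The helper name transcribe and the file name in the comment are hypothetical. MMS models operate on 16 kHz audio; the pipeline resamples audio files it decodes itself.

```python
MODEL_ID = "CLEAR-Global/mms-1b-100_400h-hau-ft"  # repository ID of this model


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one audio file with the Transformers ASR pipeline.

    Assumption: the checkpoint loads through the generic pipeline API.
    """
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    return asr(audio_path)["text"]


# Hypothetical usage (downloads the model weights on first call):
# print(transcribe("sample_hausa.wav"))
```

Building the pipeline inside the function keeps the example short; for batch transcription it would be more efficient to create the pipeline once and reuse it.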

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 2
gradient_accumulation_steps: 2
total_train_batch_size: 32
total_eval_batch_size: 16
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
num_epochs: 2.0
mixed_precision_training: Native AMP
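
Two of the values above are derived rather than set directly, and the sketch below spells out the arithmetic. The total train batch size is the per-device batch size times the number of devices times the gradient accumulation steps, and a linear schedule with warmup ramps the learning rate from 0 to its peak over the warmup steps, then decays it linearly to 0. The function names are illustrative. TOTAL_STEPS is an estimate, not a value from the card: it assumes about 14,224 optimizer steps per epoch, inferred from the logged pair step 28000 at epoch 1.9685.

```python
def effective_batch_size(per_device, num_devices, grad_accum_steps=1):
    """Number of examples contributing to one optimizer (or evaluation) step."""
    return per_device * num_devices * grad_accum_steps


def linear_lr(step, base_lr, warmup_steps, total_steps):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 28_448  # assumption: ~14,224 steps per epoch x 2 epochs

train_batch = effective_batch_size(8, 2, 2)  # 8 x 2 devices x 2 accumulation steps
eval_batch = effective_batch_size(8, 2)      # no accumulation during evaluation

lr_mid_warmup = linear_lr(50, 1e-3, 100, TOTAL_STEPS)      # halfway through warmup
lr_peak = linear_lr(100, 1e-3, 100, TOTAL_STEPS)           # end of warmup
lr_end = linear_lr(TOTAL_STEPS, 1e-3, 100, TOTAL_STEPS)    # final step
```

With these settings the batch sizes come out to 32 for training and 16 for evaluation, matching the listed totals, and the learning rate reaches its peak of 0.001 after the first 100 optimizer steps.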

Training results

Training Loss	Epoch	Step	Validation Loss	WER	CER
0.2656	0.0352	500	0.4470	0.4257	0.1129
0.3361	0.0703	1000	0.4310	0.4270	0.1113
0.3601	0.1055	1500	0.4189	0.4154	0.1084
0.3229	0.1406	2000	0.4152	0.4121	0.1068
0.2776	0.1758	2500	0.4124	0.4124	0.1063
0.2900	0.2109	3000	0.4082	0.4076	0.1055
0.2527	0.2461	3500	0.4018	0.4063	0.1050
0.2838	0.2812	4000	0.4009	0.4101	0.1048
0.2896	0.3164	4500	0.4078	0.4132	0.1069
0.1849	0.3515	5000	0.3945	0.4033	0.1040
0.2475	0.3867	5500	0.4011	0.4023	0.1034
0.2712	0.4218	6000	0.3901	0.3935	0.1017
0.2279	0.4570	6500	0.3956	0.3951	0.1018
0.2572	0.4921	7000	0.3888	0.3961	0.1017
0.2063	0.5273	7500	0.3953	0.4051	0.1042
0.2289	0.5624	8000	0.4044	0.4129	0.1084
0.1979	0.5976	8500	0.3861	0.3939	0.1010
0.2124	0.6328	9000	0.3852	0.3913	0.1007
0.2312	0.6679	9500	0.3782	0.3890	0.0995
0.1840	0.7031	10000	0.3745	0.3880	0.0993
0.2669	0.7382	10500	0.3773	0.3885	0.0998
0.2357	0.7734	11000	0.3827	0.3869	0.0992
0.2739	0.8085	11500	0.3777	0.3855	0.0984
0.1769	0.8437	12000	0.3736	0.3871	0.0986
0.1610	0.8788	12500	0.3731	0.3863	0.0990
0.2052	0.9140	13000	0.3759	0.3882	0.0988
0.1487	0.9491	13500	0.3730	0.3862	0.0985
0.1493	0.9843	14000	0.3710	0.3826	0.0983
0.1698	1.0194	14500	0.3771	0.3880	0.0999
0.2625	1.0546	15000	0.3687	0.3850	0.0980
0.1515	1.0897	15500	0.3658	0.3820	0.0975
0.2034	1.1249	16000	0.3681	0.3781	0.0969
0.2207	1.1600	16500	0.3634	0.3833	0.0975
0.1458	1.1952	17000	0.3654	0.3771	0.0965
0.1909	1.2303	17500	0.3659	0.3767	0.0962
0.1686	1.2655	18000	0.3649	0.3735	0.0957
0.2135	1.3006	18500	0.3624	0.3790	0.0958
0.1748	1.3358	19000	0.3590	0.3736	0.0956
0.1961	1.3709	19500	0.3607	0.3779	0.0966
0.1926	1.4061	20000	0.3622	0.3732	0.0955
0.1369	1.4412	20500	0.3589	0.3723	0.0951
0.2438	1.4764	21000	0.3576	0.3746	0.0950
0.1785	1.5115	21500	0.3564	0.3723	0.0947
0.1621	1.5467	22000	0.3579	0.3711	0.0947
0.1070	1.5819	22500	0.3540	0.3691	0.0942
0.1623	1.6170	23000	0.3554	0.3717	0.0947
0.1913	1.6522	23500	0.3534	0.3703	0.0939
0.1274	1.6873	24000	0.3534	0.3660	0.0933
0.1101	1.7225	24500	0.3511	0.3694	0.0936
0.2139	1.7576	25000	0.3500	0.3671	0.0933
0.1948	1.7928	25500	0.3513	0.3690	0.0936
0.1652	1.8279	26000	0.3488	0.3662	0.0932
0.2417	1.8631	26500	0.3504	0.3649	0.0928
0.2448	1.8982	27000	0.3496	0.3635	0.0925
0.1266	1.9334	27500	0.3483	0.3641	0.0926
0.1857	1.9685	28000	0.3482	0.3646	0.0927
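
A short sketch of how the log above can be read programmatically, here to find which logged evaluation had the lowest WER and which had the lowest validation loss. It uses only the last five rows, copied from the table; the EvalRow type and variable names are illustrative. Note that the headline figures at the top of the card (loss 0.3479, WER 0.3637) do not match any logged row, which suggests they come from a final evaluation after step 28000. That is an inference; the card does not say so.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    wer: float
    cer: float


# Last five evaluation rows of the training log.
LAST_ROWS = [
    EvalRow(26000, 0.3488, 0.3662, 0.0932),
    EvalRow(26500, 0.3504, 0.3649, 0.0928),
    EvalRow(27000, 0.3496, 0.3635, 0.0925),
    EvalRow(27500, 0.3483, 0.3641, 0.0926),
    EvalRow(28000, 0.3482, 0.3646, 0.0927),
]

best_by_wer = min(LAST_ROWS, key=lambda row: row.wer)
best_by_loss = min(LAST_ROWS, key=lambda row: row.val_loss)

print(f"lowest WER:  step {best_by_wer.step} (WER {best_by_wer.wer})")
print(f"lowest loss: step {best_by_loss.step} (loss {best_by_loss.val_loss})")
```

The two criteria pick different steps (27000 by WER, 28000 by validation loss), and the differences between the last few rows are small, so the metrics appear to have largely plateaued by the end of the second epoch.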

Framework versions

Transformers 4.48.1
PyTorch 2.5.1+cu121
Datasets 3.2.0
Tokenizers 0.21.0

