git-base-ww

This model is a fine-tuned version of microsoft/git-base on an unknown dataset. At the final evaluation (step 800, epoch 50.0) it reaches a validation loss of 0.5035 and a WER score of 6.7310 on the evaluation set.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 2
eval_batch_size: 2
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 16
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 50
mixed_precision_training: Native AMP
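
The hyperparameters above can be collected into a small Python sketch that also checks the derived values. The dictionary keys follow the parameter names of `transformers.TrainingArguments`, but the mapping is an assumption, not the author's training script: "Native AMP" is taken to mean `fp16`, a single device is assumed (consistent with 2 × 8 = 16), and warmup is assumed to be zero because the card lists none. `TOTAL_STEPS` is the last step in the training log.

```python
# Hyperparameters copied from the list above. Key names mirror
# transformers.TrainingArguments; the dict itself is only illustrative.
training_kwargs = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 2,
    "per_device_eval_batch_size": 2,
    "seed": 42,
    "gradient_accumulation_steps": 8,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 50,
    "fp16": True,  # assumption: "Native AMP" interpreted as fp16
}

NUM_DEVICES = 1    # assumption: single device
TOTAL_STEPS = 800  # last optimizer step in the training log
WARMUP_STEPS = 0   # assumption: no warmup is listed in the card


def total_train_batch_size(kwargs, num_devices=NUM_DEVICES):
    """Samples consumed per optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, base_lr=training_kwargs["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(total_train_batch_size(training_kwargs))  # 16
print(linear_lr(0), linear_lr(TOTAL_STEPS))     # 5e-05 0.0
```

Under these assumptions the learning rate falls linearly from 5e-05 at step 0 to zero at step 800, and each optimizer step sees 16 samples, matching total_train_batch_size.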

Training Loss	Epoch	Step	Validation Loss	Wer Score
3.6106	3.128	50	4.5591	9.3966
1.2545	6.256	100	0.8047	3.2328
0.2342	9.384	150	0.4631	0.9336
0.1497	12.512	200	0.4565	1.2560
0.106	15.64	250	0.4637	2.1828
0.0813	18.768	300	0.4687	2.2207
0.0612	21.896	350	0.4750	6.5422
0.0536	25.0	400	0.4805	6.7198
0.0426	28.128	450	0.4867	2.6293
0.0361	31.256	500	0.4890	7.3362
0.031	34.384	550	0.4939	7.0353
0.0267	37.512	600	0.5003	2.7284
0.0241	40.64	650	0.5009	6.9310
0.0227	43.768	700	0.5015	6.9078
0.021	46.896	750	0.5036	6.7776
0.0203	50.0	800	0.5035	6.7310
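
To make the trend in the table easier to read, the following sketch finds the evaluation steps with the lowest validation loss and the lowest WER score. The rows are copied from the table above; the helper `best_row` is a hypothetical name introduced for illustration only.

```python
# (step, training_loss, validation_loss, wer_score), copied from the table above
LOG = [
    (50, 3.6106, 4.5591, 9.3966),
    (100, 1.2545, 0.8047, 3.2328),
    (150, 0.2342, 0.4631, 0.9336),
    (200, 0.1497, 0.4565, 1.2560),
    (250, 0.1060, 0.4637, 2.1828),
    (300, 0.0813, 0.4687, 2.2207),
    (350, 0.0612, 0.4750, 6.5422),
    (400, 0.0536, 0.4805, 6.7198),
    (450, 0.0426, 0.4867, 2.6293),
    (500, 0.0361, 0.4890, 7.3362),
    (550, 0.0310, 0.4939, 7.0353),
    (600, 0.0267, 0.5003, 2.7284),
    (650, 0.0241, 0.5009, 6.9310),
    (700, 0.0227, 0.5015, 6.9078),
    (750, 0.0210, 0.5036, 6.7776),
    (800, 0.0203, 0.5035, 6.7310),
]

VAL_LOSS, WER = 2, 3  # column indices in each row


def best_row(rows, column):
    """Return the row with the smallest value in the given column."""
    return min(rows, key=lambda row: row[column])


best_by_loss = best_row(LOG, VAL_LOSS)
best_by_wer = best_row(LOG, WER)

print("lowest validation loss:", best_by_loss[VAL_LOSS], "at step", best_by_loss[0])
print("lowest WER score:", best_by_wer[WER], "at step", best_by_wer[0])
# lowest validation loss: 0.4565 at step 200
# lowest WER score: 0.9336 at step 150
```

Validation loss bottoms out at step 200 and the WER score at step 150, while training loss keeps falling through step 800. That pattern is consistent with overfitting in the later epochs, although the card does not state which checkpoint was kept.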