git-base-500

This model is a fine-tuned version of microsoft/git-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4804
  • WER score: 8.3286

Model description

More information needed

Intended uses & limitations

More information needed
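
GIT is an image-captioning model, so this checkpoint is presumably intended for generating captions from images. Assuming it keeps the standard GIT interface from the Transformers library, a minimal inference sketch might look like the following (the checkpoint name is taken from this card; the image path is a placeholder):

```python
from PIL import Image
from transformers import AutoProcessor, AutoModelForCausalLM

# Assumption: this checkpoint follows the standard GIT captioning interface.
processor = AutoProcessor.from_pretrained("WafaaFraih/git-base-500")
model = AutoModelForCausalLM.from_pretrained("WafaaFraih/git-base-500")

image = Image.open("example.jpg")  # placeholder image path
pixel_values = processor(images=image, return_tensors="pt").pixel_values

# Generate a caption autoregressively and decode it to text.
generated_ids = model.generate(pixel_values=pixel_values, max_length=50)
caption = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(caption)
```

This is a sketch, not verified against the checkpoint; since the training dataset is unknown, the style and domain of the generated captions are also unknown.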

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 16
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
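
The relationship between these values can be checked with plain arithmetic: the total train batch size is the per-device batch size times the gradient accumulation steps, and a linear scheduler with no warmup (none is listed above) decays the learning rate from its initial value to zero over all optimizer steps. A small sketch (the total step count of 3150 is taken from the results table in this card):

```python
# Effective (total) train batch size = per-device batch * gradient accumulation.
train_batch_size = 2
gradient_accumulation_steps = 8
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 16, matching the value reported above

# Linear decay from the initial learning rate to 0 over all training steps
# (assumes zero warmup steps, consistent with the hyperparameters listed here).
learning_rate = 5e-05
total_steps = 3150  # final optimizer step in the training results

def lr_at(step):
    return learning_rate * (1 - step / total_steps)

print(lr_at(0))     # 5e-05 at the start of training
print(lr_at(3150))  # 0.0 at the end of training
```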

Training results

| Training Loss | Epoch  | Step | Validation Loss | WER Score |
|---------------|--------|------|-----------------|-----------|
| 3.6377        | 0.8    | 50   | 4.5566          | 8.9044    |
| 1.224         | 1.592  | 100  | 0.7253          | 1.1039    |
| 0.2503        | 2.384  | 150  | 0.4574          | 0.9295    |
| 0.2197        | 3.176  | 200  | 0.4328          | 1.8787    |
| 0.1877        | 3.976  | 250  | 0.4220          | 1.2766    |
| 0.1712        | 4.768  | 300  | 0.4191          | 5.0394    |
| 0.1551        | 5.5600 | 350  | 0.4194          | 3.4952    |
| 0.1555        | 6.352  | 400  | 0.4185          | 3.5920    |
| 0.1405        | 7.144  | 450  | 0.4188          | 8.9391    |
| 0.1339        | 7.944  | 500  | 0.4157          | 3.1930    |
| 0.1236        | 8.736  | 550  | 0.4161          | 3.3608    |
| 0.1179        | 9.528  | 600  | 0.4179          | 4.5639    |
| 0.1143        | 10.32  | 650  | 0.4203          | 7.8746    |
| 0.0984        | 11.112 | 700  | 0.4218          | 3.6105    |
| 0.0977        | 11.912 | 750  | 0.4210          | 6.4683    |
| 0.0896        | 12.704 | 800  | 0.4242          | 8.9421    |
| 0.0876        | 13.496 | 850  | 0.4268          | 6.4528    |
| 0.0791        | 14.288 | 900  | 0.4296          | 8.9367    |
| 0.0861        | 15.08  | 950  | 0.4295          | 8.9462    |
| 0.0713        | 15.88  | 1000 | 0.4345          | 5.5215    |
| 0.0706        | 16.672 | 1050 | 0.4319          | 8.9839    |
| 0.0659        | 17.464 | 1100 | 0.4378          | 7.1995    |
| 0.0633        | 18.256 | 1150 | 0.4366          | 8.9056    |
| 0.0612        | 19.048 | 1200 | 0.4425          | 7.9755    |
| 0.0601        | 19.848 | 1250 | 0.4428          | 7.8967    |
| 0.054         | 20.64  | 1300 | 0.4432          | 8.9534    |
| 0.053         | 21.432 | 1350 | 0.4470          | 8.7013    |
| 0.0528        | 22.224 | 1400 | 0.4489          | 7.5454    |
| 0.0522        | 23.016 | 1450 | 0.4488          | 4.6481    |
| 0.0471        | 23.816 | 1500 | 0.4528          | 8.6278    |
| 0.0458        | 24.608 | 1550 | 0.4552          | 8.7969    |
| 0.0414        | 25.4   | 1600 | 0.4549          | 8.6983    |
| 0.0415        | 26.192 | 1650 | 0.4553          | 5.8256    |
| 0.0398        | 26.992 | 1700 | 0.4584          | 7.7479    |
| 0.0395        | 27.784 | 1750 | 0.4587          | 8.6667    |
| 0.0359        | 28.576 | 1800 | 0.4622          | 8.4624    |
| 0.0347        | 29.368 | 1850 | 0.4610          | 6.7873    |
| 0.034         | 30.16  | 1900 | 0.4634          | 8.4725    |
| 0.0328        | 30.96  | 1950 | 0.4655          | 8.4480    |
| 0.0322        | 31.752 | 2000 | 0.4658          | 8.4839    |
| 0.0301        | 32.544 | 2050 | 0.4655          | 8.4976    |
| 0.0308        | 33.336 | 2100 | 0.4673          | 4.3256    |
| 0.0275        | 34.128 | 2150 | 0.4699          | 7.0305    |
| 0.0286        | 34.928 | 2200 | 0.4708          | 4.2640    |
| 0.0277        | 35.72  | 2250 | 0.4722          | 8.4534    |
| 0.0279        | 36.512 | 2300 | 0.4712          | 8.4665    |
| 0.0262        | 37.304 | 2350 | 0.4723          | 5.3303    |
| 0.026         | 38.096 | 2400 | 0.4735          | 8.4020    |
| 0.0241        | 38.896 | 2450 | 0.4750          | 8.4779    |
| 0.0241        | 39.688 | 2500 | 0.4750          | 8.4194    |
| 0.024         | 40.48  | 2550 | 0.4765          | 7.0854    |
| 0.0226        | 41.272 | 2600 | 0.4788          | 8.4164    |
| 0.0215        | 42.064 | 2650 | 0.4766          | 3.8375    |
| 0.0227        | 42.864 | 2700 | 0.4768          | 3.4355    |
| 0.0223        | 43.656 | 2750 | 0.4770          | 3.7461    |
| 0.0211        | 44.448 | 2800 | 0.4780          | 4.7300    |
| 0.0222        | 45.24  | 2850 | 0.4785          | 3.8919    |
| 0.0211        | 46.032 | 2900 | 0.4796          | 7.5161    |
| 0.02          | 46.832 | 2950 | 0.4797          | 7.9827    |
| 0.0212        | 47.624 | 3000 | 0.4797          | 8.0400    |
| 0.0202        | 48.416 | 3050 | 0.4807          | 8.3190    |
| 0.02          | 49.208 | 3100 | 0.4799          | 8.3375    |
| 0.0206        | 50.0   | 3150 | 0.4804          | 8.3286    |
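
The WER score tracked above is word error rate: the word-level edit distance (substitutions, insertions, and deletions) between a generated caption and its reference, divided by the number of reference words. Values above 1.0 are possible when the hypothesis contains many more words than the reference, which is consistent with scores like 8.9 in this table. A minimal sketch of the metric (not the exact evaluation code used for this model):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + insertions + deletions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Levenshtein distance over words via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# One insertion ("sat") and one substitution ("a" -> "the") against a
# 5-word reference gives 2 / 5.
print(wer("a cat on a mat", "a cat sat on the mat"))  # 0.4
```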

Framework versions

  • Transformers 4.52.4
  • Pytorch 2.6.0+cu124
  • Datasets 3.6.0
  • Tokenizers 0.21.2
Model size

  • 177M parameters (Safetensors, F32 tensors)

Model tree for WafaaFraih/git-base-500

  • Base model: microsoft/git-base (this model is a fine-tune of it)