modernbert-medium-amharic-32k

This model is a fine-tuned version of answerdotai/ModernBERT-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.2505
  • Model Preparation Time: 0.0018
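
For intuition, a cross-entropy loss is often restated as a perplexity. The sketch below assumes the evaluation loss above is a mean token-level cross-entropy in nats. This is the usual convention for masked-language-model training, but the card does not state it. Under that assumption, a loss of 2.2505 corresponds to a perplexity of roughly 9.5.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 2.2505

# ASSUMPTION: the loss is a mean cross-entropy measured in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```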

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 160
  • eval_batch_size: 160
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • lr_scheduler_warmup_steps: 10000
  • num_epochs: 32
  • mixed_precision_training: Native AMP
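
The list above gives both a warmup ratio (0.1) and a warmup step count (10000). In the Transformers Trainer, a nonzero warmup step count takes precedence over the ratio, so the schedule is most likely a 10000-step linear warmup followed by a linear decay to zero. The following is a minimal sketch of that learning-rate curve in plain Python, not the training code itself. The helper name `linear_lr` is hypothetical, and the total step count is an estimate (about 6726 optimizer steps per epoch, inferred from the Epoch and Step columns of the training results, times 32 epochs).

```python
def linear_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


BASE_LR = 1e-4           # learning_rate from the list above
WARMUP_STEPS = 10_000    # lr_scheduler_warmup_steps from the list above
# ASSUMPTION: ~6726 steps per epoch, estimated from the training log,
# times num_epochs = 32. The exact total is not stated in the card.
TOTAL_STEPS = 32 * 6726

for step in (0, 5_000, 10_000, 100_000, TOTAL_STEPS):
    lr = linear_lr(step, BASE_LR, WARMUP_STEPS, TOTAL_STEPS)
    print(f"step {step:>7}: lr = {lr:.2e}")
```

With these numbers the peak learning rate of 1e-4 is reached at step 10000, which is about 1.5 epochs into training.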

Training results

Columns (in order): Training Loss, Epoch, Step, Validation Loss, Model Preparation Time
9.7902 0.1249 840 8.9451 0.0018
8.4279 0.2498 1680 8.1059 0.0018
7.8804 0.3747 2520 7.6286 0.0018
7.2817 0.4996 3360 6.9665 0.0018
6.7251 0.6244 4200 6.5003 0.0018
6.3085 0.7493 5040 6.1143 0.0018
5.9401 0.8742 5880 5.7576 0.0018
5.5838 0.9991 6720 5.4081 0.0018
5.2326 1.1240 7560 5.0634 0.0018
4.9028 1.2489 8400 4.7484 0.0018
4.6081 1.3738 9240 4.4710 0.0018
4.3543 1.4987 10080 4.2316 0.0018
4.1333 1.6236 10920 4.0414 0.0018
3.9614 1.7484 11760 3.8866 0.0018
3.8307 1.8733 12600 3.7526 0.0018
3.7194 1.9982 13440 3.6655 0.0018
3.6201 2.1231 14280 3.5835 0.0018
3.55 2.2480 15120 3.5199 0.0018
3.486 2.3729 15960 3.4576 0.0018
3.4329 2.4978 16800 3.4101 0.0018
3.3823 2.6227 17640 3.3603 0.0018
3.3336 2.7475 18480 3.3151 0.0018
3.3001 2.8724 19320 3.2799 0.0018
3.2564 2.9973 20160 3.2413 0.0018
3.2167 3.1222 21000 3.2086 0.0018
3.185 3.2471 21840 3.1729 0.0018
3.1562 3.3720 22680 3.1419 0.0018
3.1297 3.4969 23520 3.1340 0.0018
3.1039 3.6218 24360 3.0988 0.0018
3.084 3.7467 25200 3.0719 0.0018
3.0624 3.8715 26040 3.0589 0.0018
3.0377 3.9964 26880 3.0316 0.0018
3.0094 4.1213 27720 3.0182 0.0018
2.9898 4.2462 28560 3.0015 0.0018
2.9737 4.3711 29400 2.9769 0.0018
2.9583 4.4960 30240 2.9632 0.0018
2.9433 4.6209 31080 2.9401 0.0018
2.9282 4.7458 31920 2.9166 0.0018
2.9148 4.8707 32760 2.9144 0.0018
2.9028 4.9955 33600 2.8962 0.0018
2.8735 5.1204 34440 2.8910 0.0018
2.8645 5.2453 35280 2.8809 0.0018
2.8557 5.3702 36120 2.8674 0.0018
2.8409 5.4951 36960 2.8418 0.0018
2.8309 5.6200 37800 2.8343 0.0018
2.8194 5.7449 38640 2.8334 0.0018
2.8121 5.8698 39480 2.8114 0.0018
2.8005 5.9946 40320 2.8057 0.0018
2.7761 6.1195 41160 2.8075 0.0018
2.7691 6.2444 42000 2.7829 0.0018
2.7644 6.3693 42840 2.7794 0.0018
2.7557 6.4942 43680 2.7668 0.0018
2.748 6.6191 44520 2.7544 0.0018
2.7354 6.7440 45360 2.7461 0.0018
2.7317 6.8689 46200 2.7419 0.0018
2.7242 6.9938 47040 2.7316 0.0018
2.7032 7.1186 47880 2.7293 0.0018
2.6991 7.2435 48720 2.7157 0.0018
2.6917 7.3684 49560 2.7101 0.0018
2.689 7.4933 50400 2.7082 0.0018
2.6805 7.6182 51240 2.6988 0.0018
2.6742 7.7431 52080 2.6902 0.0018
2.6714 7.8680 52920 2.6819 0.0018
2.6668 7.9929 53760 2.6772 0.0018
2.6425 8.1178 54600 2.6705 0.0018
2.6387 8.2426 55440 2.6682 0.0018
2.6351 8.3675 56280 2.6562 0.0018
2.6343 8.4924 57120 2.6499 0.0018
2.6286 8.6173 57960 2.6480 0.0018
2.6237 8.7422 58800 2.6473 0.0018
2.6221 8.8671 59640 2.6284 0.0018
2.6134 8.9920 60480 2.6271 0.0018
2.5943 9.1169 61320 2.6303 0.0018
2.5975 9.2417 62160 2.6180 0.0018
2.5897 9.3666 63000 2.6117 0.0018
2.5867 9.4915 63840 2.6102 0.0018
2.582 9.6164 64680 2.5970 0.0018
2.5795 9.7413 65520 2.6003 0.0018
2.5748 9.8662 66360 2.5937 0.0018
2.5684 9.9911 67200 2.5877 0.0018
2.553 10.1160 68040 2.5921 0.0018
2.555 10.2409 68880 2.5877 0.0018
2.5497 10.3657 69720 2.5732 0.0018
2.5525 10.4906 70560 2.5714 0.0018
2.5429 10.6155 71400 2.5703 0.0018
2.5421 10.7404 72240 2.5672 0.0018
2.5398 10.8653 73080 2.5627 0.0018
2.5323 10.9902 73920 2.5595 0.0018
2.518 11.1151 74760 2.5462 0.0018
2.515 11.2400 75600 2.5546 0.0018
2.5152 11.3649 76440 2.5428 0.0018
2.512 11.4897 77280 2.5347 0.0018
2.5132 11.6146 78120 2.5453 0.0018
2.5056 11.7395 78960 2.5306 0.0018
2.5035 11.8644 79800 2.5265 0.0018
2.5044 11.9893 80640 2.5222 0.0018
2.4849 12.1142 81480 2.5327 0.0018
2.4875 12.2391 82320 2.5261 0.0018
2.4859 12.3640 83160 2.5190 0.0018
2.4828 12.4888 84000 2.5112 0.0018
2.479 12.6137 84840 2.5071 0.0018
2.4762 12.7386 85680 2.5027 0.0018
2.4768 12.8635 86520 2.5072 0.0018
2.472 12.9884 87360 2.4876 0.0018
2.4571 13.1133 88200 2.4961 0.0018
2.4578 13.2382 89040 2.4876 0.0018
2.4518 13.3631 89880 2.4921 0.0018
2.4574 13.4880 90720 2.4857 0.0018
2.4548 13.6128 91560 2.4841 0.0018
2.4492 13.7377 92400 2.4803 0.0018
2.4489 13.8626 93240 2.4827 0.0018
2.4459 13.9875 94080 2.4761 0.0018
2.4345 14.1124 94920 2.4744 0.0018
2.4317 14.2373 95760 2.4719 0.0018
2.43 14.3622 96600 2.4650 0.0018
2.4342 14.4871 97440 2.4541 0.0018
2.4305 14.6120 98280 2.4639 0.0018
2.4248 14.7368 99120 2.4631 0.0018
2.426 14.8617 99960 2.4515 0.0018
2.4245 14.9866 100800 2.4575 0.0018
2.4114 15.1115 101640 2.4604 0.0018
2.4071 15.2364 102480 2.4520 0.0018
2.4083 15.3613 103320 2.4480 0.0018
2.4041 15.4862 104160 2.4421 0.0018
2.4057 15.6111 105000 2.4427 0.0018
2.4069 15.7360 105840 2.4477 0.0018
2.4051 15.8608 106680 2.4403 0.0018
2.4021 15.9857 107520 2.4481 0.0018
2.3886 16.1106 108360 2.4382 0.0018
2.3876 16.2355 109200 2.4311 0.0018
2.3891 16.3604 110040 2.4271 0.0018
2.3853 16.4853 110880 2.4290 0.0018
2.388 16.6102 111720 2.4284 0.0018
2.3861 16.7351 112560 2.4282 0.0018
2.3847 16.8599 113400 2.4206 0.0018
2.3828 16.9848 114240 2.4247 0.0018
2.371 17.1097 115080 2.4128 0.0018
2.3695 17.2346 115920 2.4041 0.0018
2.3693 17.3595 116760 2.4086 0.0018
2.3669 17.4844 117600 2.4096 0.0018
2.3672 17.6093 118440 2.4118 0.0018
2.3689 17.7342 119280 2.4080 0.0018
2.3625 17.8591 120120 2.3945 0.0018
2.3618 17.9839 120960 2.4039 0.0018
2.3518 18.1088 121800 2.3983 0.0018
2.3551 18.2337 122640 2.3955 0.0018
2.3492 18.3586 123480 2.3946 0.0018
2.3491 18.4835 124320 2.3924 0.0018
2.3505 18.6084 125160 2.3953 0.0018
2.3503 18.7333 126000 2.3955 0.0018
2.3478 18.8582 126840 2.3878 0.0018
2.3474 18.9831 127680 2.3870 0.0018
2.3339 19.1079 128520 2.3850 0.0018
2.3385 19.2328 129360 2.3855 0.0018
2.3341 19.3577 130200 2.3786 0.0018
2.3381 19.4826 131040 2.3737 0.0018
2.3298 19.6075 131880 2.3733 0.0018
2.3336 19.7324 132720 2.3724 0.0018
2.3315 19.8573 133560 2.3711 0.0018
2.3324 19.9822 134400 2.3644 0.0018
2.3183 20.1070 135240 2.3774 0.0018
2.3195 20.2319 136080 2.3710 0.0018
2.3199 20.3568 136920 2.3710 0.0018
2.319 20.4817 137760 2.3574 0.0018
2.3202 20.6066 138600 2.3547 0.0018
2.3164 20.7315 139440 2.3651 0.0018
2.3164 20.8564 140280 2.3638 0.0018
2.3169 20.9813 141120 2.3609 0.0018
2.3048 21.1062 141960 2.3642 0.0018
2.3035 21.2310 142800 2.3641 0.0018
2.3061 21.3559 143640 2.3534 0.0018
2.3043 21.4808 144480 2.3568 0.0018
2.3043 21.6057 145320 2.3583 0.0018
2.3047 21.7306 146160 2.3521 0.0018
2.3032 21.8555 147000 2.3546 0.0018
2.3016 21.9804 147840 2.3519 0.0018
2.2917 22.1053 148680 2.3346 0.0018
2.2935 22.2302 149520 2.3432 0.0018
2.2912 22.3550 150360 2.3410 0.0018
2.2929 22.4799 151200 2.3394 0.0018
2.2905 22.6048 152040 2.3385 0.0018
2.2914 22.7297 152880 2.3315 0.0018
2.2905 22.8546 153720 2.3341 0.0018
2.2925 22.9795 154560 2.3362 0.0018
2.2798 23.1044 155400 2.3437 0.0018
2.28 23.2293 156240 2.3295 0.0018
2.2761 23.3541 157080 2.3256 0.0018
2.2776 23.4790 157920 2.3312 0.0018
2.2767 23.6039 158760 2.3289 0.0018
2.2774 23.7288 159600 2.3284 0.0018
2.2787 23.8537 160440 2.3250 0.0018
2.2739 23.9786 161280 2.3261 0.0018
2.267 24.1035 162120 2.3241 0.0018
2.2661 24.2284 162960 2.3222 0.0018
2.2644 24.3533 163800 2.3291 0.0018
2.2647 24.4781 164640 2.3198 0.0018
2.263 24.6030 165480 2.3186 0.0018
2.2671 24.7279 166320 2.3198 0.0018
2.2662 24.8528 167160 2.3177 0.0018
2.2641 24.9777 168000 2.3102 0.0018
2.2553 25.1026 168840 2.3185 0.0018
2.2547 25.2275 169680 2.3063 0.0018
2.2558 25.3524 170520 2.3042 0.0018
2.2556 25.4773 171360 2.3110 0.0018
2.2529 25.6021 172200 2.3106 0.0018
2.2535 25.7270 173040 2.3057 0.0018
2.2547 25.8519 173880 2.3117 0.0018
2.2546 25.9768 174720 2.3055 0.0018
2.2465 26.1017 175560 2.3035 0.0018
2.2415 26.2266 176400 2.3056 0.0018
2.2497 26.3515 177240 2.2983 0.0018
2.2422 26.4764 178080 2.2987 0.0018
2.2423 26.6012 178920 2.2987 0.0018
2.2407 26.7261 179760 2.2946 0.0018
2.2412 26.8510 180600 2.2873 0.0018
2.2426 26.9759 181440 2.2966 0.0018
2.2345 27.1008 182280 2.2956 0.0018
2.2359 27.2257 183120 2.2895 0.0018
2.2361 27.3506 183960 2.2884 0.0018
2.2361 27.4755 184800 2.2931 0.0018
2.2335 27.6004 185640 2.2840 0.0018
2.2318 27.7252 186480 2.2819 0.0018
2.2297 27.8501 187320 2.2834 0.0018
2.2283 27.9750 188160 2.2902 0.0018
2.2253 28.0999 189000 2.2932 0.0018
2.224 28.2248 189840 2.2878 0.0018
2.2252 28.3497 190680 2.2812 0.0018
2.2222 28.4746 191520 2.2802 0.0018
2.221 28.5995 192360 2.2781 0.0018
2.224 28.7244 193200 2.2783 0.0018
2.2206 28.8492 194040 2.2776 0.0018
2.224 28.9741 194880 2.2716 0.0018
2.2137 29.0990 195720 2.2868 0.0018
2.2178 29.2239 196560 2.2783 0.0018
2.2163 29.3488 197400 2.2689 0.0018
2.2143 29.4737 198240 2.2677 0.0018
2.2139 29.5986 199080 2.2765 0.0018
2.2134 29.7235 199920 2.2711 0.0018
2.2138 29.8483 200760 2.2725 0.0018
2.2148 29.9732 201600 2.2738 0.0018
2.2126 30.0981 202440 2.2757 0.0018
2.2093 30.2230 203280 2.2714 0.0018
2.2076 30.3479 204120 2.2771 0.0018
2.2092 30.4728 204960 2.2673 0.0018
2.208 30.5977 205800 2.2763 0.0018
2.207 30.7226 206640 2.2677 0.0018
2.2054 30.8475 207480 2.2717 0.0018
2.2063 30.9723 208320 2.2770 0.0018
2.2045 31.0972 209160 2.2580 0.0018
2.1999 31.2221 210000 2.2613 0.0018
2.2027 31.3470 210840 2.2622 0.0018
2.2046 31.4719 211680 2.2637 0.0018
2.2022 31.5968 212520 2.2596 0.0018
2.199 31.7217 213360 2.2664 0.0018
2.1969 31.8466 214200 2.2601 0.0018
2.2006 31.9715 215040 2.2599 0.0018
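
The rows above are whitespace-separated, so they are straightforward to parse for further analysis. The sketch below is illustrative only: the `LogRow` type and `parse_row` helper are hypothetical names, and the sample contains just five rows copied verbatim from the table. It finds the lowest validation loss in the sample and estimates the number of optimizer steps per epoch from the last row.

```python
from typing import NamedTuple


class LogRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    prep_time: float


def parse_row(line: str) -> LogRow:
    """Split one whitespace-separated log row into typed fields."""
    train_loss, epoch, step, val_loss, prep_time = line.split()
    return LogRow(float(train_loss), float(epoch), int(step),
                  float(val_loss), float(prep_time))


# Five rows copied verbatim from the training results table.
SAMPLE = """\
9.7902 0.1249 840 8.9451 0.0018
5.5838 0.9991 6720 5.4081 0.0018
2.3324 19.9822 134400 2.3644 0.0018
2.2045 31.0972 209160 2.2580 0.0018
2.2006 31.9715 215040 2.2599 0.0018
"""

rows = [parse_row(line) for line in SAMPLE.splitlines()]

# Lowest validation loss among the sampled rows.
best = min(rows, key=lambda r: r.val_loss)

# Steps per epoch, estimated from the final logged row.
steps_per_epoch = rows[-1].step / rows[-1].epoch

print(f"best validation loss in sample: {best.val_loss} at step {best.step}")
print(f"approx. optimizer steps per epoch: {steps_per_epoch:.0f}")
```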

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.4.1
  • Tokenizers 0.21.1

Model size

  • 50.8M params
  • Tensor type: F32 (Safetensors)

Model repository: yosefw/modernbert-medium-amharic-32k (fine-tuned from answerdotai/ModernBERT-base)