modernbert-medium-amharic-50k-1024

This model is a fine-tuned version of yosefw/modernbert-medium-amharic-50k on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.2719
  • Model Preparation Time: 0.0017
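For a masked language model, the evaluation loss is a cross-entropy in nats, so it maps to a perplexity of roughly exp(2.2719) ≈ 9.70. A quick sanity check:

```python
import math

# Evaluation cross-entropy loss reported above.
eval_loss = 2.2719

# Perplexity is the exponential of the cross-entropy loss.
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # → 9.7
```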

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 40
  • eval_batch_size: 40
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 6
  • mixed_precision_training: Native AMP
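The scheduler settings above can be sketched as a step-to-learning-rate function. This is a minimal sketch, not the Trainer's internals: the total step count of 34608 is taken from the last row of the results table below, and when both are set, a non-zero warmup_steps setting normally takes precedence over warmup_ratio in transformers.

```python
def linear_lr(step, base_lr=3e-5, warmup_steps=1000, total_steps=34608):
    """Linear warmup to base_lr, then linear decay to 0.

    total_steps is read off the last logged training step; the exact
    value used during training may differ slightly.
    """
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay phase: scale linearly from base_lr down to 0.
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)

print(linear_lr(0))      # 0.0 (start of warmup)
print(linear_lr(1000))   # 3e-05 (peak, end of warmup)
print(linear_lr(34608))  # 0.0 (end of training)
```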

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Model Preparation Time |
|:-------------:|:------:|:-----:|:---------------:|:----------------------:|
| 2.2741        | 0.1249 | 721   | 2.3156          | 0.0017                 |
| 2.2434        | 0.2498 | 1442  | 2.3270          | 0.0017                 |
| 2.2381        | 0.3747 | 2163  | 2.3179          | 0.0017                 |
| 2.2344        | 0.4996 | 2884  | 2.3133          | 0.0017                 |
| 2.2346        | 0.6245 | 3605  | 2.3114          | 0.0017                 |
| 2.2261        | 0.7494 | 4326  | 2.3129          | 0.0017                 |
| 2.2244        | 0.8742 | 5047  | 2.3043          | 0.0017                 |
| 2.2280        | 0.9991 | 5768  | 2.2971          | 0.0017                 |
| 2.2206        | 1.1240 | 6489  | 2.2945          | 0.0017                 |
| 2.2198        | 1.2489 | 7210  | 2.3039          | 0.0017                 |
| 2.2187        | 1.3738 | 7931  | 2.2985          | 0.0017                 |
| 2.2180        | 1.4987 | 8652  | 2.2926          | 0.0017                 |
| 2.2187        | 1.6236 | 9373  | 2.2912          | 0.0017                 |
| 2.2192        | 1.7485 | 10094 | 2.2885          | 0.0017                 |
| 2.2187        | 1.8734 | 10815 | 2.2992          | 0.0017                 |
| 2.2141        | 1.9983 | 11536 | 2.2906          | 0.0017                 |
| 2.2075        | 2.1232 | 12257 | 2.2908          | 0.0017                 |
| 2.2074        | 2.2481 | 12978 | 2.2970          | 0.0017                 |
| 2.2125        | 2.3729 | 13699 | 2.2939          | 0.0017                 |
| 2.2024        | 2.4978 | 14420 | 2.2982          | 0.0017                 |
| 2.2066        | 2.6227 | 15141 | 2.2868          | 0.0017                 |
| 2.2138        | 2.7476 | 15862 | 2.2868          | 0.0017                 |
| 2.2046        | 2.8725 | 16583 | 2.2893          | 0.0017                 |
| 2.2090        | 2.9974 | 17304 | 2.2882          | 0.0017                 |
| 2.2022        | 3.1223 | 18025 | 2.2852          | 0.0017                 |
| 2.2036        | 3.2472 | 18746 | 2.2798          | 0.0017                 |
| 2.2040        | 3.3721 | 19467 | 2.2926          | 0.0017                 |
| 2.2014        | 3.4970 | 20188 | 2.2893          | 0.0017                 |
| 2.2044        | 3.6219 | 20909 | 2.2868          | 0.0017                 |
| 2.2023        | 3.7468 | 21630 | 2.2836          | 0.0017                 |
| 2.1976        | 3.8716 | 22351 | 2.2832          | 0.0017                 |
| 2.2019        | 3.9965 | 23072 | 2.2912          | 0.0017                 |
| 2.1997        | 4.1214 | 23793 | 2.2795          | 0.0017                 |
| 2.1972        | 4.2463 | 24514 | 2.2865          | 0.0017                 |
| 2.1959        | 4.3712 | 25235 | 2.2841          | 0.0017                 |
| 2.1948        | 4.4961 | 25956 | 2.2772          | 0.0017                 |
| 2.1967        | 4.6210 | 26677 | 2.2702          | 0.0017                 |
| 2.1948        | 4.7459 | 27398 | 2.2854          | 0.0017                 |
| 2.1969        | 4.8708 | 28119 | 2.2856          | 0.0017                 |
| 2.1983        | 4.9957 | 28840 | 2.2733          | 0.0017                 |
| 2.1930        | 5.1206 | 29561 | 2.2844          | 0.0017                 |
| 2.1886        | 5.2455 | 30282 | 2.2813          | 0.0017                 |
| 2.1908        | 5.3703 | 31003 | 2.2755          | 0.0017                 |
| 2.1928        | 5.4952 | 31724 | 2.2728          | 0.0017                 |
| 2.1979        | 5.6201 | 32445 | 2.2717          | 0.0017                 |
| 2.1941        | 5.7450 | 33166 | 2.2741          | 0.0017                 |
| 2.1941        | 5.8699 | 33887 | 2.2743          | 0.0017                 |
| 2.1920        | 5.9948 | 34608 | 2.2792          | 0.0017                 |

Framework versions

  • Transformers 4.50.0
  • PyTorch 2.6.0+cu124
  • Datasets 3.4.1
  • Tokenizers 0.21.1

Model size

  • 54.3M parameters (F32, Safetensors)
