roberta-base-amharic-32k-256-512-v2

This model is a fine-tuned version of yosefw/roberta-base-amharic-32k-256 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.0425
  • Model Preparation Time: 0.0033
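
If the checkpoint is used as a masked-language model (which the RoBERTa architecture and the reported loss suggest, though the card does not state the training objective explicitly), it can be loaded with the standard Transformers APIs. A minimal sketch, assuming the repository id above and an illustrative Amharic prompt:

```python
# Minimal fill-mask sketch; the repository id and example sentence are assumptions,
# not taken from the model card.
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

model_id = "yosefw/roberta-base-amharic-32k-256-512-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
# "Addis Ababa is Ethiopia's <mask> city." — illustrative Amharic prompt.
print(fill_mask(f"አዲስ አበባ የኢትዮጵያ {tokenizer.mask_token} ከተማ ናት።"))
```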

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 12
  • mixed_precision_training: Native AMP
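
As a rough guide, these settings map onto the following Trainer configuration. This is a sketch under the assumption that the standard transformers Trainer API was used; argument names follow Transformers 4.50.0 and the output directory is hypothetical:

```python
# Hedged reconstruction of the hyperparameters above as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="roberta-base-amharic-32k-256-512-v2",  # hypothetical path
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    warmup_steps=1000,   # when set, this takes precedence over warmup_ratio
    num_train_epochs=12,
    fp16=True,           # "Native AMP" mixed-precision training
)
```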

Training results

Training Loss Epoch Step Validation Loss Model Preparation Time
5.9352 0.1250 1097 4.5422 0.0033
3.0997 0.2500 2194 2.2981 0.0033
2.4198 0.3750 3291 2.2186 0.0033
2.3486 0.4999 4388 2.2021 0.0033
2.3108 0.6249 5485 2.1808 0.0033
2.2912 0.7499 6582 2.1681 0.0033
2.2729 0.8749 7679 2.1540 0.0033
2.2661 0.9999 8776 2.1429 0.0033
2.2515 1.1249 9873 2.1510 0.0033
2.2449 1.2499 10970 2.1359 0.0033
2.2353 1.3748 12067 2.1340 0.0033
2.2353 1.4998 13164 2.1250 0.0033
2.2284 1.6248 14261 2.1245 0.0033
2.2269 1.7498 15358 2.1264 0.0033
2.2179 1.8748 16455 2.1105 0.0033
2.2214 1.9998 17552 2.1179 0.0033
2.2141 2.1248 18649 2.1098 0.0033
2.2055 2.2497 19746 2.1134 0.0033
2.2077 2.3747 20843 2.1123 0.0033
2.2025 2.4997 21940 2.1050 0.0033
2.1998 2.6247 23037 2.0985 0.0033
2.201 2.7497 24134 2.1059 0.0033
2.1963 2.8747 25231 2.1060 0.0033
2.1931 2.9997 26328 2.0956 0.0033
2.1884 3.1246 27425 2.0942 0.0033
2.1879 3.2496 28522 2.0907 0.0033
2.1859 3.3746 29619 2.0982 0.0033
2.1843 3.4996 30716 2.0956 0.0033
2.182 3.6246 31813 2.0886 0.0033
2.1779 3.7496 32910 2.0859 0.0033
2.1866 3.8746 34007 2.0834 0.0033
2.1784 3.9995 35104 2.0791 0.0033
2.1743 4.1245 36201 2.0841 0.0033
2.1727 4.2495 37298 2.0837 0.0033
2.173 4.3745 38395 2.0787 0.0033
2.1735 4.4995 39492 2.0807 0.0033
2.1667 4.6245 40589 2.0745 0.0033
2.1709 4.7495 41686 2.0814 0.0033
2.1679 4.8744 42783 2.0711 0.0033
2.1656 4.9994 43880 2.0758 0.0033
2.1634 5.1244 44977 2.0728 0.0033
2.1626 5.2494 46074 2.0697 0.0033
2.1599 5.3744 47171 2.0846 0.0033
2.1606 5.4994 48268 2.0718 0.0033
2.1621 5.6244 49365 2.0700 0.0033
2.1609 5.7493 50462 2.0715 0.0033
2.1571 5.8743 51559 2.0695 0.0033
2.156 5.9993 52656 2.0580 0.0033
2.1499 6.1243 53753 2.0631 0.0033
2.1554 6.2493 54850 2.0662 0.0033
2.1512 6.3743 55947 2.0643 0.0033
2.1539 6.4993 57044 2.0713 0.0033
2.1497 6.6242 58141 2.0605 0.0033
2.1471 6.7492 59238 2.0572 0.0033
2.1476 6.8742 60335 2.0542 0.0033
2.1485 6.9992 61432 2.0672 0.0033
2.1447 7.1242 62529 2.0544 0.0033
2.1423 7.2492 63626 2.0560 0.0033
2.1471 7.3742 64723 2.0501 0.0033
2.1433 7.4991 65820 2.0536 0.0033
2.1425 7.6241 66917 2.0577 0.0033
2.1411 7.7491 68014 2.0551 0.0033
2.143 7.8741 69111 2.0482 0.0033
2.1404 7.9991 70208 2.0606 0.0033
2.136 8.1241 71305 2.0605 0.0033
2.1371 8.2491 72402 2.0547 0.0033
2.1375 8.3740 73499 2.0460 0.0033
2.1369 8.4990 74596 2.0541 0.0033
2.1359 8.6240 75693 2.0508 0.0033
2.1337 8.7490 76790 2.0524 0.0033
2.1342 8.8740 77887 2.0622 0.0033
2.1365 8.9990 78984 2.0474 0.0033
2.1297 9.1240 80081 2.0510 0.0033
2.1325 9.2489 81178 2.0466 0.0033
2.1306 9.3739 82275 2.0453 0.0033
2.1305 9.4989 83372 2.0457 0.0033
2.1296 9.6239 84469 2.0522 0.0033
2.1287 9.7489 85566 2.0399 0.0033
2.1305 9.8739 86663 2.0461 0.0033
2.128 9.9989 87760 2.0428 0.0033
2.1281 10.1238 88857 2.0379 0.0033
2.1272 10.2488 89954 2.0381 0.0033
2.1268 10.3738 91051 2.0443 0.0033
2.1252 10.4988 92148 2.0371 0.0033
2.1229 10.6238 93245 2.0391 0.0033
2.1264 10.7488 94342 2.0451 0.0033
2.123 10.8738 95439 2.0412 0.0033
2.1252 10.9987 96536 2.0438 0.0033
2.1192 11.1237 97633 2.0391 0.0033
2.1236 11.2487 98730 2.0433 0.0033
2.1221 11.3737 99827 2.0386 0.0033
2.124 11.4987 100924 2.0334 0.0033
2.1228 11.6237 102021 2.0384 0.0033
2.1172 11.7487 103118 2.0400 0.0033
2.1217 11.8736 104215 2.0319 0.0033
2.1216 11.9986 105312 2.0405 0.0033
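
If the evaluation loss is the mean cross-entropy over masked tokens (the usual convention for masked-language modelling with Trainer), the reported validation loss of 2.0425 corresponds to a perplexity of roughly exp(2.0425) ≈ 7.71. A one-line check:

```python
# Perplexity from cross-entropy loss; assumes the reported loss is a mean cross-entropy.
import math

print(math.exp(2.0425))  # ≈ 7.71
```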

Framework versions

  • Transformers 4.50.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.4.1
  • Tokenizers 0.21.1