# roberta-base-amharic-32k-256-512-v2

This model is a fine-tuned version of [yosefw/roberta-base-amharic-32k-256](https://huggingface.co/yosefw/roberta-base-amharic-32k-256) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.0425
- Model Preparation Time: 0.0033
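
The card itself does not include a usage example. A minimal sketch is shown below, assuming this checkpoint is a RoBERTa-style masked language model published under the repo id above; the Amharic prompt is only an illustration and is not taken from the card.

```python
from transformers import pipeline

# Assumption: the checkpoint is a masked-LM hosted under this repo id.
model_id = "yosefw/roberta-base-amharic-32k-256-512-v2"
fill_mask = pipeline("fill-mask", model=model_id)

# Use the tokenizer's own mask token so the prompt works regardless of
# which mask symbol the vocabulary defines.
mask = fill_mask.tokenizer.mask_token
print(fill_mask(f"አዲስ አበባ የኢትዮጵያ {mask} ከተማ ናት።"))
```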
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):
- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- lr_scheduler_warmup_steps: 1000
- num_epochs: 12
- mixed_precision_training: Native AMP
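
The original training script is not published; the sketch below only maps the hyperparameters listed above onto Hugging Face `TrainingArguments`. Dataset loading, the data collator, and the `Trainer` call are omitted.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="roberta-base-amharic-32k-256-512-v2",
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    warmup_steps=1000,   # when both are set, Trainer uses the explicit step count
    num_train_epochs=12,
    fp16=True,           # "Native AMP" mixed precision
)
```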
### Training results
Training Loss | Epoch | Step | Validation Loss | Model Preparation Time |
---|---|---|---|---|
5.9352 | 0.1250 | 1097 | 4.5422 | 0.0033 |
3.0997 | 0.2500 | 2194 | 2.2981 | 0.0033 |
2.4198 | 0.3750 | 3291 | 2.2186 | 0.0033 |
2.3486 | 0.4999 | 4388 | 2.2021 | 0.0033 |
2.3108 | 0.6249 | 5485 | 2.1808 | 0.0033 |
2.2912 | 0.7499 | 6582 | 2.1681 | 0.0033 |
2.2729 | 0.8749 | 7679 | 2.1540 | 0.0033 |
2.2661 | 0.9999 | 8776 | 2.1429 | 0.0033 |
2.2515 | 1.1249 | 9873 | 2.1510 | 0.0033 |
2.2449 | 1.2499 | 10970 | 2.1359 | 0.0033 |
2.2353 | 1.3748 | 12067 | 2.1340 | 0.0033 |
2.2353 | 1.4998 | 13164 | 2.1250 | 0.0033 |
2.2284 | 1.6248 | 14261 | 2.1245 | 0.0033 |
2.2269 | 1.7498 | 15358 | 2.1264 | 0.0033 |
2.2179 | 1.8748 | 16455 | 2.1105 | 0.0033 |
2.2214 | 1.9998 | 17552 | 2.1179 | 0.0033 |
2.2141 | 2.1248 | 18649 | 2.1098 | 0.0033 |
2.2055 | 2.2497 | 19746 | 2.1134 | 0.0033 |
2.2077 | 2.3747 | 20843 | 2.1123 | 0.0033 |
2.2025 | 2.4997 | 21940 | 2.1050 | 0.0033 |
2.1998 | 2.6247 | 23037 | 2.0985 | 0.0033 |
2.201 | 2.7497 | 24134 | 2.1059 | 0.0033 |
2.1963 | 2.8747 | 25231 | 2.1060 | 0.0033 |
2.1931 | 2.9997 | 26328 | 2.0956 | 0.0033 |
2.1884 | 3.1246 | 27425 | 2.0942 | 0.0033 |
2.1879 | 3.2496 | 28522 | 2.0907 | 0.0033 |
2.1859 | 3.3746 | 29619 | 2.0982 | 0.0033 |
2.1843 | 3.4996 | 30716 | 2.0956 | 0.0033 |
2.182 | 3.6246 | 31813 | 2.0886 | 0.0033 |
2.1779 | 3.7496 | 32910 | 2.0859 | 0.0033 |
2.1866 | 3.8746 | 34007 | 2.0834 | 0.0033 |
2.1784 | 3.9995 | 35104 | 2.0791 | 0.0033 |
2.1743 | 4.1245 | 36201 | 2.0841 | 0.0033 |
2.1727 | 4.2495 | 37298 | 2.0837 | 0.0033 |
2.173 | 4.3745 | 38395 | 2.0787 | 0.0033 |
2.1735 | 4.4995 | 39492 | 2.0807 | 0.0033 |
2.1667 | 4.6245 | 40589 | 2.0745 | 0.0033 |
2.1709 | 4.7495 | 41686 | 2.0814 | 0.0033 |
2.1679 | 4.8744 | 42783 | 2.0711 | 0.0033 |
2.1656 | 4.9994 | 43880 | 2.0758 | 0.0033 |
2.1634 | 5.1244 | 44977 | 2.0728 | 0.0033 |
2.1626 | 5.2494 | 46074 | 2.0697 | 0.0033 |
2.1599 | 5.3744 | 47171 | 2.0846 | 0.0033 |
2.1606 | 5.4994 | 48268 | 2.0718 | 0.0033 |
2.1621 | 5.6244 | 49365 | 2.0700 | 0.0033 |
2.1609 | 5.7493 | 50462 | 2.0715 | 0.0033 |
2.1571 | 5.8743 | 51559 | 2.0695 | 0.0033 |
2.156 | 5.9993 | 52656 | 2.0580 | 0.0033 |
2.1499 | 6.1243 | 53753 | 2.0631 | 0.0033 |
2.1554 | 6.2493 | 54850 | 2.0662 | 0.0033 |
2.1512 | 6.3743 | 55947 | 2.0643 | 0.0033 |
2.1539 | 6.4993 | 57044 | 2.0713 | 0.0033 |
2.1497 | 6.6242 | 58141 | 2.0605 | 0.0033 |
2.1471 | 6.7492 | 59238 | 2.0572 | 0.0033 |
2.1476 | 6.8742 | 60335 | 2.0542 | 0.0033 |
2.1485 | 6.9992 | 61432 | 2.0672 | 0.0033 |
2.1447 | 7.1242 | 62529 | 2.0544 | 0.0033 |
2.1423 | 7.2492 | 63626 | 2.0560 | 0.0033 |
2.1471 | 7.3742 | 64723 | 2.0501 | 0.0033 |
2.1433 | 7.4991 | 65820 | 2.0536 | 0.0033 |
2.1425 | 7.6241 | 66917 | 2.0577 | 0.0033 |
2.1411 | 7.7491 | 68014 | 2.0551 | 0.0033 |
2.143 | 7.8741 | 69111 | 2.0482 | 0.0033 |
2.1404 | 7.9991 | 70208 | 2.0606 | 0.0033 |
2.136 | 8.1241 | 71305 | 2.0605 | 0.0033 |
2.1371 | 8.2491 | 72402 | 2.0547 | 0.0033 |
2.1375 | 8.3740 | 73499 | 2.0460 | 0.0033 |
2.1369 | 8.4990 | 74596 | 2.0541 | 0.0033 |
2.1359 | 8.6240 | 75693 | 2.0508 | 0.0033 |
2.1337 | 8.7490 | 76790 | 2.0524 | 0.0033 |
2.1342 | 8.8740 | 77887 | 2.0622 | 0.0033 |
2.1365 | 8.9990 | 78984 | 2.0474 | 0.0033 |
2.1297 | 9.1240 | 80081 | 2.0510 | 0.0033 |
2.1325 | 9.2489 | 81178 | 2.0466 | 0.0033 |
2.1306 | 9.3739 | 82275 | 2.0453 | 0.0033 |
2.1305 | 9.4989 | 83372 | 2.0457 | 0.0033 |
2.1296 | 9.6239 | 84469 | 2.0522 | 0.0033 |
2.1287 | 9.7489 | 85566 | 2.0399 | 0.0033 |
2.1305 | 9.8739 | 86663 | 2.0461 | 0.0033 |
2.128 | 9.9989 | 87760 | 2.0428 | 0.0033 |
2.1281 | 10.1238 | 88857 | 2.0379 | 0.0033 |
2.1272 | 10.2488 | 89954 | 2.0381 | 0.0033 |
2.1268 | 10.3738 | 91051 | 2.0443 | 0.0033 |
2.1252 | 10.4988 | 92148 | 2.0371 | 0.0033 |
2.1229 | 10.6238 | 93245 | 2.0391 | 0.0033 |
2.1264 | 10.7488 | 94342 | 2.0451 | 0.0033 |
2.123 | 10.8738 | 95439 | 2.0412 | 0.0033 |
2.1252 | 10.9987 | 96536 | 2.0438 | 0.0033 |
2.1192 | 11.1237 | 97633 | 2.0391 | 0.0033 |
2.1236 | 11.2487 | 98730 | 2.0433 | 0.0033 |
2.1221 | 11.3737 | 99827 | 2.0386 | 0.0033 |
2.124 | 11.4987 | 100924 | 2.0334 | 0.0033 |
2.1228 | 11.6237 | 102021 | 2.0384 | 0.0033 |
2.1172 | 11.7487 | 103118 | 2.0400 | 0.0033 |
2.1217 | 11.8736 | 104215 | 2.0319 | 0.0033 |
2.1216 | 11.9986 | 105312 | 2.0405 | 0.0033 |
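
The validation loss is a mean cross-entropy over masked tokens, so its exponential gives a rough pseudo-perplexity for the masked-LM objective. A quick check (not from the original card):

```python
import math

# Final reported evaluation loss from the card.
eval_loss = 2.0425
print(math.exp(eval_loss))  # ≈ 7.71 pseudo-perplexity
```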
### Framework versions
- Transformers 4.50.0
- Pytorch 2.6.0+cu124
- Datasets 3.4.1
- Tokenizers 0.21.1