berel_finetuned_on_HB_20_epochs
This model is a fine-tuned version of dicta-il/BEREL. The training dataset is not recorded (the auto-generated card lists it as None). It achieves the following results on the evaluation set:
- Loss: 8.5450
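To put the reported loss on a more intuitive scale, it can be converted to a perplexity. The sketch below assumes the loss is a mean cross-entropy measured in nats (the usual convention for language-model training losses, but not confirmed by this card); `uniform_loss` and its `vocab_size` argument are hypothetical helpers added for illustration.

```python
import math

# Final evaluation loss reported on this card.
eval_loss = 8.5450

# Assumption: the loss is a mean cross-entropy in nats, so
# perplexity = exp(loss). This comes out at roughly 5.1 thousand.
perplexity = math.exp(eval_loss)


def uniform_loss(vocab_size: int) -> float:
    """Loss of a model that guesses uniformly over `vocab_size` tokens.

    `vocab_size` is a hypothetical input; the card does not state the
    tokenizer's vocabulary size.
    """
    return math.log(vocab_size)


print(f"perplexity ~ {perplexity:.0f}")
```

Under that assumption, a loss of 8.5450 corresponds to the model being about as uncertain as a uniform guess over roughly five thousand tokens at each predicted position.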
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 20
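As an illustration of what `lr_scheduler_type: linear` implies for the learning rate over the run, here is a minimal sketch of a linear decay schedule in plain Python. Two values are assumptions rather than facts from the card: `warmup_steps=0` (no warmup is listed) and `total_steps=46_440` (estimated as about 2,322 steps per epoch, inferred from the training log, times 20 epochs). The function name `linear_lr` is hypothetical.

```python
def linear_lr(step: int,
              base_lr: float = 5e-4,       # learning_rate from the card
              total_steps: int = 46_440,   # assumption: ~2,322 steps/epoch * 20 epochs
              warmup_steps: int = 0) -> float:  # assumption: no warmup listed
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, the midpoint, and the end of training.
for s in (0, 23_220, 46_440):
    print(s, linear_lr(s))
```

With these assumptions the rate starts at the full 0.0005 on the first step and falls steadily to zero at the end of the run.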
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
7.1591 | 0.2153 | 500 | 8.6618 |
8.6589 | 0.4307 | 1000 | 8.7495 |
8.6554 | 0.6460 | 1500 | 8.6908 |
8.6627 | 0.8613 | 2000 | 8.7017 |
8.6223 | 1.0767 | 2500 | 8.6827 |
8.5857 | 1.2920 | 3000 | 8.6622 |
8.6139 | 1.5073 | 3500 | 8.7055 |
8.5881 | 1.7227 | 4000 | 8.6809 |
8.5148 | 1.9380 | 4500 | 8.6633 |
8.5885 | 2.1533 | 5000 | 8.6592 |
8.5679 | 2.3686 | 5500 | 8.6508 |
8.5388 | 2.5840 | 6000 | 8.6486 |
8.5463 | 2.7993 | 6500 | 8.5792 |
8.5194 | 3.0146 | 7000 | 8.6633 |
8.48 | 3.2300 | 7500 | 8.6601 |
8.5492 | 3.4453 | 8000 | 8.5711 |
8.5415 | 3.6606 | 8500 | 8.6319 |
8.472 | 3.8760 | 9000 | 8.6781 |
8.4879 | 4.0913 | 9500 | 8.6063 |
8.4655 | 4.3066 | 10000 | 8.6381 |
8.425 | 4.5220 | 10500 | 8.6260 |
8.4945 | 4.7373 | 11000 | 8.6256 |
8.4558 | 4.9526 | 11500 | 8.6616 |
8.5101 | 5.1680 | 12000 | 8.6613 |
8.4096 | 5.3833 | 12500 | 8.6660 |
8.4309 | 5.5986 | 13000 | 8.6026 |
8.4496 | 5.8140 | 13500 | 8.6025 |
9.3234 | 6.0293 | 14000 | 8.7347 |
8.5097 | 6.2446 | 14500 | 8.6560 |
8.4168 | 6.4599 | 15000 | 8.5908 |
8.445 | 6.6753 | 15500 | 8.6343 |
8.4541 | 6.8906 | 16000 | 8.6516 |
8.4338 | 7.1059 | 16500 | 8.6521 |
8.4299 | 7.3213 | 17000 | 8.6038 |
8.457 | 7.5366 | 17500 | 8.6108 |
8.4626 | 7.7519 | 18000 | 8.6150 |
8.407 | 7.9673 | 18500 | 8.6590 |
8.4536 | 8.1826 | 19000 | 8.5758 |
8.4815 | 8.3979 | 19500 | 8.5993 |
8.4254 | 8.6133 | 20000 | 8.6394 |
8.4508 | 8.8286 | 20500 | 8.6176 |
8.4415 | 9.0439 | 21000 | 8.6250 |
8.4393 | 9.2593 | 21500 | 8.6635 |
8.4358 | 9.4746 | 22000 | 8.5987 |
8.4646 | 9.6899 | 22500 | 8.6133 |
8.4111 | 9.9053 | 23000 | 8.6195 |
8.3735 | 10.1206 | 23500 | 8.6907 |
8.443 | 10.3359 | 24000 | 8.6044 |
8.376 | 10.5512 | 24500 | 8.6409 |
8.4031 | 10.7666 | 25000 | 8.5897 |
8.4022 | 10.9819 | 25500 | 8.6592 |
8.3487 | 11.1972 | 26000 | 8.6485 |
8.3594 | 11.4126 | 26500 | 8.5704 |
8.3992 | 11.6279 | 27000 | 8.6308 |
8.448 | 11.8432 | 27500 | 8.5839 |
8.3923 | 12.0586 | 28000 | 8.6194 |
8.4294 | 12.2739 | 28500 | 8.6220 |
8.367 | 12.4892 | 29000 | 8.6278 |
8.372 | 12.7046 | 29500 | 8.6453 |
8.3982 | 12.9199 | 30000 | 8.5927 |
8.3998 | 13.1352 | 30500 | 8.6343 |
8.4504 | 13.3506 | 31000 | 8.6210 |
8.3868 | 13.5659 | 31500 | 8.6174 |
8.421 | 13.7812 | 32000 | 8.5785 |
8.3744 | 13.9966 | 32500 | 8.6087 |
8.3772 | 14.2119 | 33000 | 8.6390 |
8.3912 | 14.4272 | 33500 | 8.6107 |
8.373 | 14.6425 | 34000 | 8.6597 |
8.3741 | 14.8579 | 34500 | 8.5642 |
8.346 | 15.0732 | 35000 | 8.6119 |
8.4013 | 15.2885 | 35500 | 8.5845 |
8.3922 | 15.5039 | 36000 | nan |
8.367 | 15.7192 | 36500 | 8.5948 |
8.3847 | 15.9345 | 37000 | 8.5956 |
8.3212 | 16.1499 | 37500 | 8.6331 |
8.3453 | 16.3652 | 38000 | 8.5590 |
8.3495 | 16.5805 | 38500 | 8.5951 |
8.3877 | 16.7959 | 39000 | 8.6073 |
8.3976 | 17.0112 | 39500 | 8.5885 |
8.3509 | 17.2265 | 40000 | 8.6037 |
8.386 | 17.4419 | 40500 | 8.6091 |
8.3475 | 17.6572 | 41000 | 8.6172 |
8.3148 | 17.8725 | 41500 | 8.6449 |
8.3426 | 18.0879 | 42000 | 8.6365 |
8.3148 | 18.3032 | 42500 | 8.6147 |
8.3736 | 18.5185 | 43000 | 8.5603 |
8.3439 | 18.7339 | 43500 | 8.5605 |
8.3471 | 18.9492 | 44000 | 8.5325 |
8.3408 | 19.1645 | 44500 | 8.5323 |
8.3804 | 19.3798 | 45000 | 8.5840 |
8.3231 | 19.5952 | 45500 | 8.6323 |
8.3746 | 19.8105 | 46000 | 8.5450 |
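The log above supports a few quick sanity checks. The sketch below copies four rows from the table and derives the number of optimizer steps per epoch, an approximate training-set size, and the best validation loss among the copied rows. The size estimate assumes a single device and no gradient accumulation, neither of which the card confirms; variable names are illustrative.

```python
import math

# A few rows copied from the table: (training loss, epoch, step, validation loss).
rows = [
    (7.1591, 0.2153, 500, 8.6618),
    (8.3922, 15.5039, 36000, float("nan")),  # validation loss logged as nan
    (8.3408, 19.1645, 44500, 8.5323),
    (8.3746, 19.8105, 46000, 8.5450),
]

TRAIN_BATCH_SIZE = 8  # train_batch_size from the hyperparameter list

# Steps per epoch, estimated from the last logged row.
_, last_epoch, last_step, _ = rows[-1]
steps_per_epoch = round(last_step / last_epoch)

# Assumption: one device, no gradient accumulation, so each step
# consumes at most TRAIN_BATCH_SIZE examples.
approx_examples = steps_per_epoch * TRAIN_BATCH_SIZE

# Ignore nan entries when looking for the best validation loss.
valid = [r for r in rows if not math.isnan(r[3])]
best = min(valid, key=lambda r: r[3])

print(steps_per_epoch, approx_examples, best[2], best[3])
```

Across the full table, the lowest validation loss is 8.5323 at step 44500, slightly below the 8.5450 of the final logged step, and validation loss stays between about 8.53 and 8.75 for the whole run.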
Framework versions
- Transformers 4.47.1
- Pytorch 2.5.1+cu118
- Datasets 3.2.0
- Tokenizers 0.21.0