pythia-160m-shuffled-pg19
This model is a fine-tuned version of yurakuratov/pythia-160m-rnd on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 5.9773
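Assuming this loss is the mean token-level cross-entropy reported by the Trainer (not stated explicitly above), it corresponds to a validation perplexity of roughly exp(5.9773) ≈ 394:

```python
import math

val_loss = 5.9773            # final validation loss from the table below
perplexity = math.exp(val_loss)
print(round(perplexity, 1))  # ~394.4
```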
Model description
More information needed
Intended uses & limitations
More information needed
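No usage guidance has been provided yet; as a fine-tuned Pythia-160m (GPT-NeoX) checkpoint, the model should nonetheless load with the standard transformers causal-LM classes. A minimal sketch, assuming the checkpoint and tokenizer are published under yurakuratov/pythia-160m-shuffled-pg19 on the Hub:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "yurakuratov/pythia-160m-shuffled-pg19"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```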
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 25
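For reference, these hyperparameters map onto a transformers TrainingArguments configuration roughly as follows. This is a sketch, not the exact training script; the output path is a placeholder and logging/evaluation settings are omitted.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the configuration listed above.
training_args = TrainingArguments(
    output_dir="pythia-160m-shuffled-pg19",  # placeholder output path
    learning_rate=3e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=25,
)
```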
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
6.9501 | 0.5593 | 250 | 6.8935 |
6.5228 | 1.1186 | 500 | 6.5062 |
6.3755 | 1.6779 | 750 | 6.3901 |
6.2914 | 2.2371 | 1000 | 6.3212 |
6.2433 | 2.7964 | 1250 | 6.2800 |
6.1989 | 3.3557 | 1500 | 6.2246 |
6.1642 | 3.9150 | 1750 | 6.1784 |
6.1303 | 4.4743 | 2000 | 6.1952 |
6.1105 | 5.0336 | 2250 | 6.1830 |
6.1022 | 5.5928 | 2500 | 6.1601 |
6.0819 | 6.1521 | 2750 | 6.1461 |
6.0729 | 6.7114 | 3000 | 6.1581 |
6.0704 | 7.2707 | 3250 | 6.1284 |
6.0564 | 7.8300 | 3500 | 6.1152 |
6.0422 | 8.3893 | 3750 | 6.1052 |
6.0272 | 8.9485 | 4000 | 6.1243 |
6.025 | 9.5078 | 4250 | 6.1150 |
6.0096 | 10.0671 | 4500 | 6.0761 |
6.0014 | 10.6264 | 4750 | 6.0883 |
6.001 | 11.1857 | 5000 | 6.0950 |
5.9984 | 11.7450 | 5250 | 6.0633 |
5.9943 | 12.3043 | 5500 | 6.0714 |
5.9836 | 12.8635 | 5750 | 6.0981 |
5.9819 | 13.4228 | 6000 | 6.0536 |
5.9825 | 13.9821 | 6250 | 6.0519 |
5.9677 | 14.5414 | 6500 | 6.0923 |
5.9645 | 15.1007 | 6750 | 6.0295 |
5.9689 | 15.6600 | 7000 | 6.0396 |
5.9667 | 16.2192 | 7250 | 6.0684 |
5.9598 | 16.7785 | 7500 | 6.0128 |
5.9414 | 17.3378 | 7750 | 6.0212 |
5.9427 | 17.8971 | 8000 | 6.0452 |
5.9403 | 18.4564 | 8250 | 6.0217 |
5.9439 | 19.0157 | 8500 | 6.0177 |
5.9404 | 19.5749 | 8750 | 6.0494 |
5.9328 | 20.1342 | 9000 | 5.9959 |
5.9344 | 20.6935 | 9250 | 6.0190 |
5.9323 | 21.2528 | 9500 | 5.9959 |
5.9273 | 21.8121 | 9750 | 6.0320 |
5.9164 | 22.3714 | 10000 | 6.0198 |
5.9237 | 22.9306 | 10250 | 5.9934 |
5.921 | 23.4899 | 10500 | 6.0037 |
5.9169 | 24.0492 | 10750 | 6.0041 |
5.9089 | 24.6085 | 11000 | 5.9773 |
Framework versions
- Transformers 4.50.0
- Pytorch 2.6.0+cu124
- Datasets 3.4.1
- Tokenizers 0.21.1