# tinybert_base_train_book_ent_15p_s_init_book
This model is a fine-tuned version of google/bert_uncased_L-4_H-512_A-8 on the gokulsrinivasagan/processed_book_corpus-ld dataset. It achieves the following results on the evaluation set:
- Loss: 2.7883
- Accuracy: 0.5016
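For intuition, assuming the reported loss is a mean per-token cross-entropy in nats (as in Transformers masked-LM training), it maps directly to a perplexity of exp(loss):

```python
import math

eval_loss = 2.7883            # best validation loss reported above
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))   # ~16.25, i.e. the model is as uncertain as a
                              # uniform choice over ~16 tokens per masked slot
```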
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 120
- eval_batch_size: 120
- seed: 10
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 10000
- num_epochs: 24
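The linear scheduler with 10,000 warmup steps ramps the learning rate up to 0.0001 and then decays it linearly toward zero. A minimal sketch of that schedule (a hypothetical re-implementation for illustration, not the Trainer's code; `total_steps` is an estimate of ~455,000, from ~18,980 steps per epoch times 24 epochs):

```python
def linear_warmup_lr(step, base_lr=1e-4, warmup_steps=10_000, total_steps=455_000):
    """Learning rate at a given optimizer step under linear warmup + decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / warmup_steps
    # Linear decay from base_lr at the end of warmup down to 0 at total_steps.
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

print(linear_warmup_lr(5_000))    # halfway through warmup: 5e-05
print(linear_warmup_lr(10_000))   # peak learning rate: 0.0001
```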
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 5.7148 | 0.5269 | 10000 | 5.3754 | 0.1660 |
| 5.4378 | 1.0539 | 20000 | 5.0823 | 0.1892 |
| 5.1562 | 1.5808 | 30000 | 4.7818 | 0.2149 |
| 4.8487 | 2.1077 | 40000 | 4.4633 | 0.2476 |
| 4.5657 | 2.6346 | 50000 | 4.1493 | 0.2907 |
| 4.2716 | 3.1616 | 60000 | 3.7873 | 0.3466 |
| 4.2871 | 3.6885 | 70000 | 3.8181 | 0.3361 |
| 4.1543 | 4.2154 | 80000 | 3.5822 | 0.3733 |
| 3.8571 | 4.7423 | 90000 | 3.4747 | 0.3897 |
| 3.7109 | 5.2693 | 100000 | 3.1587 | 0.4452 |
| 3.5309 | 5.7962 | 110000 | 3.1118 | 0.4521 |
| 3.5456 | 6.3231 | 120000 | 3.1531 | 0.4396 |
| 3.3806 | 6.8500 | 130000 | 2.8550 | 0.4952 |
| 3.4529 | 7.3770 | 140000 | 3.0184 | 0.4605 |
| 3.3215 | 7.9039 | 150000 | 2.7883 | 0.5016 |
| 3.513 | 8.4308 | 160000 | 3.0518 | 0.4508 |
| 3.3968 | 8.9577 | 170000 | 2.9743 | 0.4616 |
| 3.449 | 9.4847 | 180000 | 2.9690 | 0.4628 |
| 3.3697 | 10.0116 | 190000 | 2.8899 | 0.4777 |
| 3.357 | 10.5385 | 200000 | 2.9087 | 0.4713 |
| 3.387 | 11.0654 | 210000 | 2.8973 | 0.4734 |
| 3.4019 | 11.5924 | 220000 | 2.9180 | 0.4674 |
| 3.3729 | 12.1193 | 230000 | 2.9308 | 0.4650 |
| 3.4055 | 12.6462 | 240000 | 2.9422 | 0.4640 |
| 3.4147 | 13.1731 | 250000 | 3.0244 | 0.4468 |
| 3.395 | 13.7001 | 260000 | 2.9477 | 0.4606 |
| 3.4227 | 14.2270 | 270000 | 2.9277 | 0.4636 |
| 3.5185 | 14.7539 | 280000 | 3.0647 | 0.4362 |
| 3.4673 | 15.2809 | 290000 | 3.0344 | 0.4418 |
| 3.4164 | 15.8078 | 300000 | 3.0563 | 0.4379 |
| 3.3326 | 16.3347 | 310000 | 3.0179 | 0.4443 |
| 3.3937 | 16.8616 | 320000 | 3.0324 | 0.4397 |
| 3.4516 | 17.3886 | 330000 | 3.1178 | 0.4245 |
| 3.4207 | 17.9155 | 340000 | 3.0349 | 0.4407 |
| 3.3921 | 18.4424 | 350000 | 2.9866 | 0.4471 |
| 3.3771 | 18.9693 | 360000 | 2.9835 | 0.4488 |
| 3.3844 | 19.4963 | 370000 | 2.9886 | 0.4477 |
| 3.3288 | 20.0232 | 380000 | 2.9555 | 0.4523 |
| 3.3691 | 20.5501 | 390000 | 2.9938 | 0.4449 |
| 3.3104 | 21.0770 | 400000 | 2.9500 | 0.4528 |
| 3.3398 | 21.6040 | 410000 | 2.9999 | 0.4437 |
| 3.3325 | 22.1309 | 420000 | 2.9703 | 0.4481 |
| 3.3466 | 22.6578 | 430000 | 2.9785 | 0.4474 |
| 3.3444 | 23.1847 | 440000 | 2.9894 | 0.4450 |
| 3.3103 | 23.7117 | 450000 | 2.9477 | 0.4523 |
### Framework versions
- Transformers 4.51.2
- Pytorch 2.6.0+cu126
- Datasets 3.5.0
- Tokenizers 0.21.1