lmind_nq_train6000_eval6489_v1_recite_qa_gpt2-xl
This model is a fine-tuned version of gpt2-xl on the tyzhu/lmind_nq_train6000_eval6489_v1_recite_qa dataset. After the final epoch of training (epoch 20, step 21160) it achieves the following results on the evaluation set:
- Loss: 0.3634
- Accuracy: 0.8783
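The card does not say how the loss is defined. If it is the mean per-token cross-entropy in nats, which is the usual convention when fine-tuning a causal language model with the Transformers Trainer but is an assumption here, it can be converted to perplexity. A minimal sketch:

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 0.3634

# Assumption: the loss is mean per-token cross-entropy measured in nats,
# so perplexity is exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.3f}")
```

Under that assumption the evaluation perplexity comes out at roughly 1.44.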
Model description
More information needed
Intended uses & limitations
More information needed
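Until the author documents intended uses, the sketch below shows one plausible way to load the checkpoint for text generation with the Transformers library. The helper name `generate` and its default `max_new_tokens` are illustrative choices, not part of the original card, and the prompt format used during fine-tuning is not documented here.

```python
MODEL_ID = "tyzhu/lmind_nq_train6000_eval6489_v1_recite_qa_gpt2-xl"


def generate(prompt: str, max_new_tokens: int = 32) -> str:
    """Greedy-decode a continuation of `prompt` with the fine-tuned checkpoint."""
    # Imported lazily so that defining this helper does not require the
    # library (or the multi-gigabyte weights) until it is actually called.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # greedy decoding
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("...")` with a question-style prompt downloads the weights on first use. Since gpt2-xl has about 1.5 billion parameters, expect several gigabytes of download and memory.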
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- num_epochs: 20.0
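For readers who want to reproduce a similar run, the list above maps roughly onto `transformers.TrainingArguments` as sketched below. This is not the author's training script. It assumes the reported batch sizes are per-device values and that no gradient accumulation was used; `build_training_arguments` and the `output_dir` default are hypothetical names chosen for illustration.

```python
# Hyperparameters copied from the list above, keyed by the matching
# TrainingArguments parameter names.
HYPERPARAMS = {
    "learning_rate": 3e-05,
    "per_device_train_batch_size": 16,  # assumption: reported size is per device
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 20.0,
}


def build_training_arguments(output_dir: str = "outputs"):
    """Build a TrainingArguments object from the hyperparameters above."""
    # Imported lazily so the dictionary can be inspected without the library.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

With a constant scheduler the learning rate stays at 3e-05 for the entire run, with no warmup or decay.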
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
2.1506 | 1.0 | 1058 | 1.7260 | 0.7131 |
1.5141 | 2.0 | 2116 | 1.2579 | 0.7600 |
0.9961 | 3.0 | 3174 | 0.8674 | 0.8056 |
0.6354 | 4.0 | 4232 | 0.6007 | 0.8397 |
0.4213 | 5.0 | 5290 | 0.4423 | 0.8612 |
0.2830 | 6.0 | 6348 | 0.3741 | 0.8703 |
0.2072 | 7.0 | 7406 | 0.3511 | 0.8742 |
0.1641 | 8.0 | 8464 | 0.3441 | 0.8764 |
0.1365 | 9.0 | 9522 | 0.3439 | 0.8769 |
0.1225 | 10.0 | 10580 | 0.3467 | 0.8774 |
0.1129 | 11.0 | 11638 | 0.3479 | 0.8776 |
0.1074 | 12.0 | 12696 | 0.3505 | 0.8778 |
0.1026 | 13.0 | 13754 | 0.3498 | 0.8774 |
0.1000 | 14.0 | 14812 | 0.3514 | 0.8780 |
0.0953 | 15.0 | 15870 | 0.3595 | 0.8782 |
0.0944 | 16.0 | 16928 | 0.3604 | 0.8781 |
0.0911 | 17.0 | 17986 | 0.3604 | 0.8781 |
0.0905 | 18.0 | 19044 | 0.3617 | 0.8781 |
0.0879 | 19.0 | 20102 | 0.3662 | 0.8784 |
0.0866 | 20.0 | 21160 | 0.3634 | 0.8783 |
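The table shows validation loss bottoming out well before the final epoch while training loss keeps falling. The short script below, which only restates the numbers in the table, locates the best checkpoints and the final gap between training and validation loss.

```python
# (epoch, training loss, validation loss, accuracy) copied from the table above.
RESULTS = [
    (1, 2.1506, 1.7260, 0.7131), (2, 1.5141, 1.2579, 0.7600),
    (3, 0.9961, 0.8674, 0.8056), (4, 0.6354, 0.6007, 0.8397),
    (5, 0.4213, 0.4423, 0.8612), (6, 0.2830, 0.3741, 0.8703),
    (7, 0.2072, 0.3511, 0.8742), (8, 0.1641, 0.3441, 0.8764),
    (9, 0.1365, 0.3439, 0.8769), (10, 0.1225, 0.3467, 0.8774),
    (11, 0.1129, 0.3479, 0.8776), (12, 0.1074, 0.3505, 0.8778),
    (13, 0.1026, 0.3498, 0.8774), (14, 0.1000, 0.3514, 0.8780),
    (15, 0.0953, 0.3595, 0.8782), (16, 0.0944, 0.3604, 0.8781),
    (17, 0.0911, 0.3604, 0.8781), (18, 0.0905, 0.3617, 0.8781),
    (19, 0.0879, 0.3662, 0.8784), (20, 0.0866, 0.3634, 0.8783),
]
STEPS_PER_EPOCH = 1058  # step count at epoch 1 in the table

# Epoch with the lowest validation loss and epoch with the highest accuracy.
best_loss = min(RESULTS, key=lambda row: row[2])
best_acc = max(RESULTS, key=lambda row: row[3])

# Gap between validation and training loss at the last epoch.
final = RESULTS[-1]
final_gap = final[2] - final[1]

print(f"lowest validation loss: {best_loss[2]} at epoch {best_loss[0]} "
      f"(step {best_loss[0] * STEPS_PER_EPOCH})")
print(f"highest accuracy: {best_acc[3]} at epoch {best_acc[0]}")
print(f"final validation-minus-training loss gap: {final_gap:.4f}")
```

Validation loss is lowest at epoch 9 (0.3439) and drifts upward afterwards, while accuracy peaks at epoch 19 (0.8784), only 0.0015 above its epoch 9 value. The widening gap between the two losses suggests that the later epochs mostly fit the training set.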
Framework versions
- Transformers 4.34.0
- Pytorch 2.1.0+cu121
- Datasets 2.14.5
- Tokenizers 0.14.1
Model tree
- Base model: openai-community/gpt2-xl
- Training dataset: tyzhu/lmind_nq_train6000_eval6489_v1_recite_qa
Evaluation results
- Accuracy on tyzhu/lmind_nq_train6000_eval6489_v1_recite_qa (self-reported): 0.878