gpt2_u070_tiny-stories_1024_dpos

This model was fine-tuned on the roneneldan/TinyStories dataset; the base model is not named in this card. At the last logged evaluation (step 19000, epoch 0.9883) it reaches a validation loss of 1.1952 and an accuracy of 0.6824.

Model description

More information needed

The hyperparameters used during training are not listed in this card. Training results at each evaluation step:

Training Loss	Epoch	Step	Validation Loss	Accuracy
2.9077	0.0520	1000	2.4249	0.4548
1.9481	0.1040	2000	1.7689	0.5757
1.7051	0.1560	3000	1.5897	0.6072
1.5909	0.2081	4000	1.4959	0.6243
1.5165	0.2601	5000	1.4326	0.6359
1.4645	0.3121	6000	1.3880	0.6441
1.4312	0.3641	7000	1.3518	0.6512
1.3952	0.4161	8000	1.3251	0.6562
1.3699	0.4681	9000	1.3029	0.6608
1.3497	0.5202	10000	1.2838	0.6643
1.3312	0.5722	11000	1.2663	0.6680
1.3141	0.6242	12000	1.2515	0.6708
1.3035	0.6762	13000	1.2400	0.6731
1.2874	0.7282	14000	1.2295	0.6752
1.2800	0.7802	15000	1.2204	0.6772
1.2689	0.8322	16000	1.2115	0.6791
1.2625	0.8843	17000	1.2051	0.6805
1.2536	0.9363	18000	1.1992	0.6817
1.2535	0.9883	19000	1.1952	0.6824
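
The validation losses above can be easier to interpret as perplexities. The sketch below is illustrative only: it assumes the reported validation loss is a mean per-token cross-entropy in nats (the usual convention for causal language model training, but not stated in this card), so that perplexity is exp(loss). The `rows` list and the `perplexity` helper are hypothetical names introduced here; the numbers are copied from three rows of the table.

```python
import math

# (step, epoch, validation_loss) copied from three rows of the table above
rows = [
    (1000, 0.0520, 2.4249),
    (10000, 0.5202, 1.2838),
    (19000, 0.9883, 1.1952),
]


def perplexity(loss_nats: float) -> float:
    """Perplexity for a mean per-token cross-entropy given in nats (assumed)."""
    return math.exp(loss_nats)


for step, epoch, val_loss in rows:
    print(f"step {step:>5}  epoch {epoch:.4f}  "
          f"val_loss {val_loss:.4f}  ppl {perplexity(val_loss):.2f}")

# Rough steps-per-epoch estimate from the last logged row (step / epoch fraction)
steps_per_epoch = rows[-1][0] / rows[-1][1]
print(f"approx. steps per epoch: {steps_per_epoch:.0f}")
```

Under that assumption, the final validation loss of 1.1952 corresponds to a perplexity of roughly 3.3, and the step and epoch columns imply a little over 19,000 optimizer steps per epoch.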