gpt2_u060_tiny-stories_1024_dpos

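This model is a fine-tuned version of a base model that this card does not name, trained on the roneneldan/TinyStories dataset. At the last logged evaluation (step 19000, epoch 0.9906) it reaches a validation loss of 1.1945 and an accuracy of 0.6825 on the evaluation set.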

Model description

More information needed

The hyperparameters used during training are not listed in this card. Training results, with evaluation every 1000 steps:

Training Loss	Epoch	Step	Validation Loss	Accuracy
2.9072	0.0521	1000	2.4167	0.4556
1.9548	0.1043	2000	1.7778	0.5734
1.707	0.1564	3000	1.5956	0.6059
1.589	0.2085	4000	1.4949	0.6241
1.5186	0.2607	5000	1.4333	0.6353
1.463	0.3128	6000	1.3858	0.6444
1.4279	0.3649	7000	1.3521	0.6508
1.3932	0.4171	8000	1.3245	0.6564
1.3687	0.4692	9000	1.3019	0.6604
1.3498	0.5213	10000	1.2831	0.6644
1.3315	0.5735	11000	1.2661	0.6677
1.3141	0.6256	12000	1.2517	0.6706
1.3006	0.6778	13000	1.2390	0.6731
1.2872	0.7299	14000	1.2279	0.6754
1.2757	0.7820	15000	1.2187	0.6773
1.2688	0.8342	16000	1.2117	0.6788
1.2609	0.8863	17000	1.2041	0.6803
1.2572	0.9384	18000	1.1983	0.6817
1.2517	0.9906	19000	1.1945	0.6825
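
The validation losses above are easier to compare as perplexities. The sketch below converts a few rows copied from the table, assuming the reported loss is a mean per-token cross-entropy in nats (typical for causal language model training, but not stated in this card). The helper names `perplexity` and `steps_per_epoch` are illustrative and not part of any library.

```python
import math

# (step, epoch, validation_loss) for a few rows copied from the table above
rows = [
    (1000, 0.0521, 2.4167),
    (10000, 0.5213, 1.2831),
    (19000, 0.9906, 1.1945),
]


def perplexity(loss_nats: float) -> float:
    """Perplexity implied by a mean cross-entropy loss, assumed to be in nats."""
    return math.exp(loss_nats)


def steps_per_epoch(step: int, epoch: float) -> float:
    """Rough estimate of optimizer steps in one epoch from a logged (step, epoch) pair."""
    return step / epoch


for step, epoch, val_loss in rows:
    print(
        f"step {step:>5} (epoch {epoch:.4f}): "
        f"val loss {val_loss:.4f} -> perplexity {perplexity(val_loss):.2f}"
    )

# The epoch column is rounded, so this is only an approximation.
print(f"approx. steps per epoch: {steps_per_epoch(19000, 0.9906):.0f}")
```

Under that assumption, the final validation loss of 1.1945 corresponds to a perplexity of roughly 3.3, and the logged epoch fractions imply a little over 19,000 steps per epoch, so the table covers just under one full pass over the training data.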