gpt2_u050_tiny-stories_1024_dpos

This model was fine-tuned on the roneneldan/TinyStories dataset. Validation loss and accuracy measured on the evaluation set during training are listed in the table below.

Model description

More information needed

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
2.9229	0.0523	1000	2.4379	0.4515
1.9572	0.1046	2000	1.7788	0.5732
1.7096	0.1569	3000	1.5963	0.6055
1.5948	0.2091	4000	1.5008	0.6224
1.5178	0.2614	5000	1.4377	0.6346
1.4709	0.3137	6000	1.3926	0.6431
1.4296	0.3660	7000	1.3555	0.6503
1.4009	0.4183	8000	1.3329	0.6542
1.3740	0.4706	9000	1.3055	0.6596
1.3542	0.5228	10000	1.2870	0.6632
1.3326	0.5751	11000	1.2712	0.6667
1.3171	0.6274	12000	1.2554	0.6697
1.3052	0.6797	13000	1.2436	0.6722
1.2931	0.7320	14000	1.2320	0.6746
1.2813	0.7843	15000	1.2234	0.6762
1.2765	0.8366	16000	1.2160	0.6776
1.2659	0.8888	17000	1.2086	0.6792
1.2634	0.9411	18000	1.2035	0.6804
1.2545	0.9934	19000	1.1999	0.6812
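
The losses in the table can be easier to interpret as perplexities. The sketch below converts the first and last validation losses and estimates the number of optimizer steps per epoch from the Step and Epoch columns. It assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card. The helper names `perplexity` and `steps_per_epoch` are illustrative, not part of any library.

```python
import math

# (step, epoch, validation_loss) copied from the first and last rows of the table.
first_row = (1000, 0.0523, 2.4379)
last_row = (19000, 0.9934, 1.1999)


def perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy (ASSUMED to be in nats) to perplexity."""
    return math.exp(loss_nats)


def steps_per_epoch(step: int, epoch: float) -> float:
    """Estimate steps in one full epoch from a (step, epoch fraction) pair."""
    return step / epoch


ppl_first = perplexity(first_row[2])
ppl_last = perplexity(last_row[2])
est_steps = steps_per_epoch(last_row[0], last_row[1])

print(f"validation perplexity at step {first_row[0]}: {ppl_first:.2f}")
print(f"validation perplexity at step {last_row[0]}: {ppl_last:.2f}")
print(f"estimated steps per epoch: {est_steps:.0f}")
```

Under that assumption, the final validation loss of 1.1999 corresponds to a perplexity of roughly 3.32, and the logged epoch fractions imply a little over 19,000 steps per epoch, so the table covers just under one full pass over the training data.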