gpt2_u080_tiny-stories_1024_dpos

This model is a fine-tuned version of on the roneneldan/TinyStories dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Accuracy
2.8944	0.0519	1000	2.4388	0.4515
1.9548	0.1037	2000	1.7761	0.5749
1.711	0.1556	3000	1.5925	0.6070
1.5894	0.2074	4000	1.4947	0.6246
1.5161	0.2593	5000	1.4353	0.6355
1.4649	0.3112	6000	1.3872	0.6444
1.4264	0.3630	7000	1.3541	0.6509
1.3953	0.4149	8000	1.3277	0.6560
1.3682	0.4668	9000	1.3023	0.6610
1.349	0.5186	10000	1.2824	0.6650
1.3305	0.5705	11000	1.2654	0.6682
1.3179	0.6223	12000	1.2524	0.6708
1.3027	0.6742	13000	1.2396	0.6735
1.2895	0.7261	14000	1.2290	0.6758
1.2797	0.7779	15000	1.2197	0.6775
1.2697	0.8298	16000	1.2126	0.6790
1.2629	0.8817	17000	1.2044	0.6807
1.2562	0.9335	18000	1.1985	0.6820
1.2531	0.9854	19000	1.1952	0.6827