gpt2_u040_tiny-stories_1024_dpos

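This model is a fine-tuned version of a base model (not named in this card) on the roneneldan/TinyStories dataset. At the last logged evaluation (step 19000, epoch 0.9936), it reaches a validation loss of 1.2016 and an accuracy of 0.6809.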

Model description

More information needed

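Training results

The hyperparameters used during training are not listed in this card. The table below shows the training loss, validation loss, and accuracy logged every 1000 steps over roughly one epoch.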

Training Loss	Epoch	Step	Validation Loss	Accuracy
2.8993	0.0523	1000	2.4128	0.4573
1.9582	0.1046	2000	1.7777	0.5731
1.7164	0.1569	3000	1.5985	0.6049
1.5976	0.2092	4000	1.5014	0.6224
1.5257	0.2615	5000	1.4399	0.6342
1.4723	0.3138	6000	1.3955	0.6425
1.4337	0.3661	7000	1.3618	0.6489
1.4068	0.4184	8000	1.3318	0.6549
1.3766	0.4707	9000	1.3082	0.6594
1.3567	0.5230	10000	1.2884	0.6632
1.3373	0.5753	11000	1.2717	0.6667
1.3231	0.6275	12000	1.2593	0.6692
1.3101	0.6798	13000	1.2451	0.6717
1.2962	0.7321	14000	1.2344	0.6740
1.2842	0.7844	15000	1.2259	0.6759
1.2752	0.8367	16000	1.2178	0.6776
1.2718	0.8890	17000	1.2103	0.6789
1.2628	0.9413	18000	1.2050	0.6803
1.2576	0.9936	19000	1.2016	0.6809
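
The validation losses above can be turned into perplexities, which are often easier to compare across language models. The following is a minimal sketch, assuming the reported validation loss is the mean per-token cross-entropy in nats (the usual convention for causal language model training, but not confirmed by this card). The names `EVAL_LOG` and `perplexity` are illustrative and do not come from the original training code.

```python
import math

# (step, validation loss) pairs copied from the table above.
EVAL_LOG = [
    (1000, 2.4128),
    (10000, 1.2884),
    (19000, 1.2016),
]


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


for step, loss in EVAL_LOG:
    print(f"step {step:>5}: loss {loss:.4f} -> perplexity {perplexity(loss):.2f}")
```

Under that assumption, the last checkpoint's validation loss of 1.2016 corresponds to a perplexity of about 3.33.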