# collapse_gemma-2-2b_hs2_accumulatesubsample_iter20_sftsd2

This model is a fine-tuned version of [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.2131
- Num Input Tokens Seen: 4981168
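
The card does not include a usage example; below is a minimal, hedged loading sketch using the standard `transformers` API. The repo id is the one this card is published under; the dtype and `device_map` choices are assumptions for convenience, not part of the training setup.

```python
# Minimal inference sketch (assumes access to the base gemma-2-2b weights
# has been granted on the Hub; bf16 and device_map="auto" are assumptions).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RylanSchaeffer/collapse_gemma-2-2b_hs2_accumulatesubsample_iter20_sftsd2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```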
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):
- learning_rate: 8e-06
- train_batch_size: 8
- eval_batch_size: 16
- seed: 2
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1
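
As a reproduction aid, here is a hedged sketch of how these values map onto `transformers.TrainingArguments`. The `output_dir` is a placeholder and the training script itself is not published; note that the effective batch size of 128 follows from 8 per-device examples × 16 gradient-accumulation steps on a single device.

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="collapse_gemma-2-2b_hs2_accumulatesubsample_iter20_sftsd2",  # placeholder
    learning_rate=8e-6,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    seed=2,
    gradient_accumulation_steps=16,  # 8 * 16 = 128 effective train batch size
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.05,
    num_train_epochs=1,
    adam_beta1=0.9,                  # Adam settings as listed above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```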
### Training results
| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log        | 0      | 0    | 1.3909          | 0                 |
| 1.3101        | 0.0528 | 5    | 1.2785          | 259360            |
| 1.0508        | 0.1057 | 10   | 1.2293          | 530168            |
| 0.9566        | 0.1585 | 15   | 1.2129          | 794520            |
| 0.8309        | 0.2114 | 20   | 1.2625          | 1058200           |
| 0.8029        | 0.2642 | 25   | 1.2461          | 1327328           |
| 0.6388        | 0.3170 | 30   | 1.2820          | 1587968           |
| 0.5711        | 0.3699 | 35   | 1.2793          | 1853352           |
| 0.5408        | 0.4227 | 40   | 1.2597          | 2118856           |
| 0.5223        | 0.4756 | 45   | 1.2438          | 2384304           |
| 0.4692        | 0.5284 | 50   | 1.2533          | 2651296           |
| 0.5336        | 0.5812 | 55   | 1.2343          | 2907944           |
| 0.4685        | 0.6341 | 60   | 1.2426          | 3175608           |
| 0.4822        | 0.6869 | 65   | 1.2253          | 3449616           |
| 0.4543        | 0.7398 | 70   | 1.2388          | 3719976           |
| 0.4193        | 0.7926 | 75   | 1.2284          | 3987200           |
| 0.4007        | 0.8454 | 80   | 1.2234          | 4247992           |
| 0.3711        | 0.8983 | 85   | 1.2196          | 4508112           |
| 0.4195        | 0.9511 | 90   | 1.2262          | 4769520           |
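
For scale (derived from the table above, not from separate logs): each logged interval covers 5 optimizer steps of 128 sequences, and the token counter advances by roughly 257k-274k tokens per interval, i.e. about 51k-55k tokens per optimizer step, or an average of roughly 400-430 tokens per training sequence.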
### Framework versions
- Transformers 4.44.0
- Pytorch 2.4.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1