collapse_gemma-2-2b_hs2_accumulatesubsample_iter20_sftsd1

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. At the final evaluation (step 95, epoch 1.0) it reaches a validation loss of 1.2128 after 4,956,000 input tokens seen.

Model description

More information needed

The training hyperparameters are not recorded in this card. Training results, logged every 5 steps over one epoch:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.3061	0.0526	5	1.2755	257768
1.1139	0.1053	10	1.2161	516928
0.8992	0.1579	15	1.2251	774912
0.7783	0.2105	20	1.2543	1046504
0.6654	0.2632	25	1.2786	1304672
0.6199	0.3158	30	1.2854	1564472
0.5221	0.3684	35	1.2730	1825704
0.4487	0.4211	40	1.2795	2083416
0.467	0.4737	45	1.2633	2341304
0.4486	0.5263	50	1.2577	2609808
0.4169	0.5789	55	1.2187	2865536
0.3921	0.6316	60	1.2464	3125408
0.3376	0.6842	65	1.2217	3387088
0.3697	0.7368	70	1.2219	3650704
0.3067	0.7895	75	1.2148	3918312
0.3436	0.8421	80	1.2127	4176968
0.3345	0.8947	85	1.2084	4435856
0.3397	0.9474	90	1.2054	4698528
0.2657	1.0	95	1.2128	4956000
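
The table is easier to read with two numbers pulled out: the step with the lowest validation loss, and the gap between validation and training loss at the end of the run. The Python sketch below copies the rows above into a list and computes both. The names `ROWS`, `best_step`, and `train_val_gap` are illustrative assumptions and do not come from the training script, which is not part of this card.

```python
# Rows copied from the table above:
# (step, training_loss, validation_loss, input_tokens_seen).
# training_loss is None where the log shows "No log".
ROWS = [
    (0, None, 1.3909, 0),
    (5, 1.3061, 1.2755, 257768),
    (10, 1.1139, 1.2161, 516928),
    (15, 0.8992, 1.2251, 774912),
    (20, 0.7783, 1.2543, 1046504),
    (25, 0.6654, 1.2786, 1304672),
    (30, 0.6199, 1.2854, 1564472),
    (35, 0.5221, 1.2730, 1825704),
    (40, 0.4487, 1.2795, 2083416),
    (45, 0.467, 1.2633, 2341304),
    (50, 0.4486, 1.2577, 2609808),
    (55, 0.4169, 1.2187, 2865536),
    (60, 0.3921, 1.2464, 3125408),
    (65, 0.3376, 1.2217, 3387088),
    (70, 0.3697, 1.2219, 3650704),
    (75, 0.3067, 1.2148, 3918312),
    (80, 0.3436, 1.2127, 4176968),
    (85, 0.3345, 1.2084, 4435856),
    (90, 0.3397, 1.2054, 4698528),
    (95, 0.2657, 1.2128, 4956000),
]


def best_step(rows):
    """Return (step, validation_loss) for the row with the lowest validation loss."""
    step, _, val_loss, _ = min(rows, key=lambda row: row[2])
    return step, val_loss


def train_val_gap(rows):
    """Return validation loss minus training loss at the last logged step."""
    _, train_loss, val_loss, _ = rows[-1]
    return val_loss - train_loss


if __name__ == "__main__":
    print("best step and validation loss:", best_step(ROWS))
    print("final validation/training gap:", round(train_val_gap(ROWS), 4))
```

On these rows, the lowest validation loss is 1.2054 at step 90, slightly below the final value of 1.2128. Training loss keeps falling to 0.2657 while validation loss stays near 1.2, so the gap between the two widens over the run.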