collapse_gemma-2-2b_hs2_replace_iter2_sftsd1

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4692
  • Num Input Tokens Seen: 8037456
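The card does not include a usage snippet. A minimal sketch for loading the checkpoint with the `transformers` library, assuming the weights are published on the Hub under the repo id `jkazdan/collapse_gemma-2-2b_hs2_replace_iter2_sftsd1` and load like any other Gemma-2 causal LM:

```python
# Hypothetical usage sketch, not part of the original card.
# Assumption: the weights are hosted on the Hugging Face Hub under the id below.
model_id = "jkazdan/collapse_gemma-2-2b_hs2_replace_iter2_sftsd1"

def generate_sample(prompt: str, max_new_tokens: int = 32) -> str:
    """Download the ~2.6B-parameter checkpoint and generate a short continuation.

    Imports are deferred so the snippet can be inspected without pulling
    in torch/transformers up front.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Calling `generate_sample("Hello")` triggers the full model download, so it is best run on a machine with a GPU and sufficient disk space.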

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-06
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 1
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
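The listed total_train_batch_size of 128 is not an independent setting; it is the per-device batch size multiplied by the gradient accumulation steps. A small sketch restating the hyperparameters above as a plain dict (the dict itself is illustrative, the values are copied from the list):

```python
# Illustrative restatement of the training hyperparameters above.
config = {
    "learning_rate": 8e-06,
    "train_batch_size": 8,           # per-device
    "eval_batch_size": 16,
    "seed": 1,
    "gradient_accumulation_steps": 16,
    "lr_scheduler_type": "constant_with_warmup",
    "lr_scheduler_warmup_ratio": 0.05,
    "num_epochs": 1,
}

# Effective (total) train batch size = per-device batch x accumulation steps.
effective_batch = config["train_batch_size"] * config["gradient_accumulation_steps"]
print(effective_batch)  # → 128
```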

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log        | 0      | 0    | 1.3956          | 0                 |
| 1.6452        | 0.0346 | 5    | 1.3058          | 278656            |
| 1.3928        | 0.0692 | 10   | 1.2047          | 558136            |
| 1.3648        | 0.1038 | 15   | 1.1749          | 828560            |
| 1.1606        | 0.1383 | 20   | 1.1904          | 1108192           |
| 0.9722        | 0.1729 | 25   | 1.2270          | 1388824           |
| 0.8185        | 0.2075 | 30   | 1.3096          | 1665952           |
| 0.6731        | 0.2421 | 35   | 1.3850          | 1950056           |
| 0.5466        | 0.2767 | 40   | 1.4228          | 2228112           |
| 0.4882        | 0.3113 | 45   | 1.5026          | 2510296           |
| 0.4421        | 0.3459 | 50   | 1.4879          | 2795248           |
| 0.3396        | 0.3805 | 55   | 1.4673          | 3079696           |
| 0.2269        | 0.4150 | 60   | 1.5111          | 3363824           |
| 0.2738        | 0.4496 | 65   | 1.4618          | 3641232           |
| 0.3523        | 0.4842 | 70   | 1.4619          | 3912784           |
| 0.2859        | 0.5188 | 75   | 1.4459          | 4191336           |
| 0.1768        | 0.5534 | 80   | 1.4447          | 4471320           |
| 0.1786        | 0.5880 | 85   | 1.4194          | 4751488           |
| 0.1399        | 0.6226 | 90   | 1.4671          | 5027576           |
| 0.1653        | 0.6572 | 95   | 1.4218          | 5303728           |
| 0.1802        | 0.6917 | 100  | 1.4062          | 5582944           |
| 0.1076        | 0.7263 | 105  | 1.4140          | 5859696           |
| 0.16          | 0.7609 | 110  | 1.4109          | 6130248           |
| 0.0994        | 0.7955 | 115  | 1.4027          | 6409520           |
| 0.0925        | 0.8301 | 120  | 1.4232          | 6691040           |
| 0.114         | 0.8647 | 125  | 1.4581          | 6969344           |
| 0.1758        | 0.8993 | 130  | 1.4161          | 7248104           |
| 0.0975        | 0.9339 | 135  | 1.4421          | 7526360           |
| 0.1121        | 0.9684 | 140  | 1.4560          | 7808664           |
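The logged validation losses are easy to scan programmatically. A minimal sketch using a few of the (step, validation loss) pairs copied from the table above:

```python
# A subset of the (step, validation_loss) pairs from the training-results table.
val_losses = [
    (0, 1.3956),
    (5, 1.3058),
    (10, 1.2047),
    (15, 1.1749),
    (20, 1.1904),
    (45, 1.5026),
    (85, 1.4194),
    (140, 1.4560),
]

# Find the step with the lowest logged validation loss.
best_step, best_loss = min(val_losses, key=lambda pair: pair[1])
print(best_step, best_loss)  # → 15 1.1749
```

The scan makes the trend easy to spot: validation loss bottoms out at step 15 (epoch ~0.10) and then drifts upward even as training loss keeps falling, which is consistent with the final evaluation loss of 1.4692 reported above.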

Framework versions

  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1