
# collapse_gemma-2-2b_hs2_accumulatesubsample_iter20_sftsd0

This model is a fine-tuned version of [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 1.2287
- Num Input Tokens Seen: 4938184

## Model description

More information needed

## Intended uses & limitations

More information needed
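
The card gives no usage guidance, so here is a minimal inference sketch, assuming the checkpoint loads as a standard `transformers` causal LM (the prompt is an arbitrary example, not from the card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RylanSchaeffer/collapse_gemma-2-2b_hs2_accumulatesubsample_iter20_sftsd0"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# The repository stores weights in BF16, so load in that dtype.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```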

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 8e-06
- train_batch_size: 8
- eval_batch_size: 16
- seed: 0
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1
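
For reproduction, these settings map onto `transformers.TrainingArguments` roughly as sketched below. This is a hypothetical reconstruction, not the author's actual training script; the `output_dir` and any model/data wiring are placeholders.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="collapse_gemma-2-2b_hs2_accumulatesubsample_iter20_sftsd0",  # placeholder
    learning_rate=8e-6,
    per_device_train_batch_size=8,   # train_batch_size: 8
    per_device_eval_batch_size=16,   # eval_batch_size: 16
    seed=0,
    gradient_accumulation_steps=16,  # 8 * 16 = 128 total train batch size
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.05,
    num_train_epochs=1,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```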

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log        | 0      | 0    | 1.3909          | 0                 |
| 1.3396        | 0.0533 | 5    | 1.2821          | 269216            |
| 1.0707        | 0.1067 | 10   | 1.2356          | 537456            |
| 0.9303        | 0.16   | 15   | 1.2354          | 795928            |
| 0.8553        | 0.2133 | 20   | 1.2527          | 1054264           |
| 0.8268        | 0.2667 | 25   | 1.2553          | 1324584           |
| 0.7279        | 0.32   | 30   | 1.2700          | 1597000           |
| 0.5158        | 0.3733 | 35   | 1.2862          | 1858032           |
| 0.5511        | 0.4267 | 40   | 1.2565          | 2122448           |
| 0.5151        | 0.48   | 45   | 1.2456          | 2386632           |
| 0.5688        | 0.5333 | 50   | 1.2360          | 2651920           |
| 0.408         | 0.5867 | 55   | 1.2481          | 2923680           |
| 0.4403        | 0.64   | 60   | 1.2211          | 3186272           |
| 0.3863        | 0.6933 | 65   | 1.2360          | 3456024           |
| 0.4065        | 0.7467 | 70   | 1.2128          | 3727192           |
| 0.4249        | 0.8    | 75   | 1.2300          | 3989832           |
| 0.4252        | 0.8533 | 80   | 1.2140          | 4250048           |
| 0.3838        | 0.9067 | 85   | 1.2314          | 4509488           |
| 0.4182        | 0.96   | 90   | 1.2114          | 4776720           |
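
Assuming the validation loss above is the usual mean token-level cross-entropy in nats, perplexity is its exponential; a quick sanity check on the final eval loss:

```python
import math

# Reported final evaluation loss from this card.
final_eval_loss = 1.2287

# Perplexity = exp(cross-entropy in nats per token).
print(f"perplexity ~= {math.exp(final_eval_loss):.2f}")  # ~= 3.42
```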

### Framework versions

- Transformers 4.44.0
- Pytorch 2.4.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
