
collapse_gemma-2-2b_hs2_replace_iter15_sftsd2

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.5155
  • Num Input Tokens Seen: 4641384

Model description

More information needed

Intended uses & limitations

More information needed
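Pending fuller documentation, the checkpoint should load like any other causal-LM repository on the Hub via `transformers`. A minimal, untested sketch (the repo id is taken from this card's title; the dtype choice and generation settings are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id taken from this model card's title.
REPO_ID = "RylanSchaeffer/collapse_gemma-2-2b_hs2_replace_iter15_sftsd2"

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Download the checkpoint and decode a continuation of `prompt`."""
    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
    model = AutoModelForCausalLM.from_pretrained(REPO_ID, torch_dtype="bfloat16")
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Note that the first call to `generate` downloads the full BF16 weights (roughly 5 GB for a 2.61B-parameter model).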

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-06
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 2
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
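The listed `total_train_batch_size` follows from the per-device batch size and gradient accumulation; a quick consistency check (assuming a single device, which the card does not state):

```python
# Hyperparameters as listed above.
train_batch_size = 8
gradient_accumulation_steps = 16
num_devices = 1  # assumption: device count is not stated on this card

# Effective batch size = per-device batch * accumulation steps * devices.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 128, matching total_train_batch_size above
```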

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log        | 0      | 0    | 1.3909          | 0                 |
| 1.6014        | 0.0511 | 5    | 1.2802          | 240072            |
| 0.8389        | 0.1021 | 10   | 1.3196          | 475816            |
| 0.3993        | 0.1532 | 15   | 1.5512          | 718216            |
| 0.2213        | 0.2042 | 20   | 1.7697          | 960224            |
| 0.094         | 0.2553 | 25   | 2.0740          | 1203544           |
| 0.0569        | 0.3063 | 30   | 2.2531          | 1446408           |
| 0.0275        | 0.3574 | 35   | 2.4064          | 1688376           |
| 0.0254        | 0.4084 | 40   | 2.4999          | 1928088           |
| 0.0235        | 0.4595 | 45   | 2.5458          | 2169296           |
| 0.0205        | 0.5105 | 50   | 2.5605          | 2407520           |
| 0.0258        | 0.5616 | 55   | 2.5508          | 2644816           |
| 0.0212        | 0.6126 | 60   | 2.5421          | 2884400           |
| 0.0214        | 0.6637 | 65   | 2.5326          | 3119768           |
| 0.0202        | 0.7147 | 70   | 2.5302          | 3353920           |
| 0.0203        | 0.7658 | 75   | 2.5297          | 3597696           |
| 0.0232        | 0.8168 | 80   | 2.5191          | 3830680           |
| 0.0198        | 0.8679 | 85   | 2.5112          | 4073168           |
| 0.0212        | 0.9190 | 90   | 2.5135          | 4300904           |
| 0.0194        | 0.9700 | 95   | 2.5167          | 4546272           |

Framework versions

  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
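To reproduce this environment, the versions above can be pinned in a `requirements.txt` (package names assumed to be the standard PyPI ones; the `+cu121` build tag additionally requires installing from the PyTorch CUDA 12.1 wheel index):

```
transformers==4.44.0
torch==2.4.0
datasets==2.20.0
tokenizers==0.19.1
```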
  • Model size: 2.61B params (safetensors)
  • Tensor type: BF16
