
collapse_gemma-2-2b_hs2_replace_iter14_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.6142
  • Num input tokens seen: 4,828,184

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-06
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 0
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
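The reported total train batch size is consistent with the per-device batch size and gradient-accumulation setting; a quick arithmetic check (assuming training on a single device, which the card does not state):

```python
# Effective batch size = per-device batch size x gradient accumulation steps
# (single-device assumption; the card does not list a device count).
train_batch_size = 8
gradient_accumulation_steps = 16
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 128, matching total_train_batch_size above
```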

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log        | 0      | 0    | 1.3909          | 0                 |
| 1.4148        | 0.0511 | 5    | 1.2837          | 247024            |
| 0.7742        | 0.1021 | 10   | 1.3395          | 499008            |
| 0.5796        | 0.1532 | 15   | 1.5431          | 741368            |
| 0.2962        | 0.2042 | 20   | 1.7539          | 987472            |
| 0.1328        | 0.2553 | 25   | 2.0165          | 1225360           |
| 0.061         | 0.3063 | 30   | 2.2330          | 1472936           |
| 0.0457        | 0.3574 | 35   | 2.3817          | 1718176           |
| 0.0254        | 0.4084 | 40   | 2.4657          | 1967528           |
| 0.0238        | 0.4595 | 45   | 2.5407          | 2213992           |
| 0.0246        | 0.5105 | 50   | 2.5689          | 2463832           |
| 0.0226        | 0.5616 | 55   | 2.5767          | 2710400           |
| 0.0229        | 0.6126 | 60   | 2.5694          | 2965856           |
| 0.0236        | 0.6637 | 65   | 2.5710          | 3221744           |
| 0.0207        | 0.7147 | 70   | 2.5780          | 3472392           |
| 0.0219        | 0.7658 | 75   | 2.5823          | 3723160           |
| 0.0239        | 0.8168 | 80   | 2.5922          | 3975240           |
| 0.0223        | 0.8679 | 85   | 2.5955          | 4227248           |
| 0.0222        | 0.9190 | 90   | 2.6030          | 4480456           |
| 0.0215        | 0.9700 | 95   | 2.6140          | 4725336           |
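With `constant_with_warmup` scheduling and a warmup ratio of 0.05, the warmup length can be estimated from the log above. A rough back-of-the-envelope check (the steps-per-epoch figure is inferred from the table, not taken from the config):

```python
# Step 95 corresponds to epoch 0.9700, so one epoch is roughly
# 95 / 0.97 ~= 98 optimizer steps (an estimate inferred from the log).
steps_per_epoch = round(95 / 0.9700)
warmup_steps = round(steps_per_epoch * 0.05)  # 5% warmup ratio
print(steps_per_epoch, warmup_steps)  # 98 5
```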

Framework versions

  • Transformers 4.44.0
  • PyTorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model tree for RylanSchaeffer/collapse_gemma-2-2b_hs2_replace_iter14_sftsd0

  • Base model: google/gemma-2-2b