---
library_name: peft
license: gemma
base_model: google/codegemma-7b
tags:
  - trl
  - sft
  - generated_from_trainer
model-index:
  - name: code-bench-CodeGemma-7B-cgv1-ds
    results: []
---

# code-bench-CodeGemma-7B-cgv1-ds

This model is a fine-tuned version of [google/codegemma-7b](https://huggingface.co/google/codegemma-7b) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 0.0947
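
Because `library_name: peft` marks this repository as a parameter-efficient adapter rather than full model weights, it is loaded on top of the base model. A minimal sketch, assuming the adapter lives at `Zacktree/code-bench-CodeGemma-7B-cgv1-ds` (inferred from this card; substitute the actual repo id or a local path):

```python
# Sketch: load the PEFT adapter on top of google/codegemma-7b.
# The adapter repo id below is an assumption inferred from the card name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "google/codegemma-7b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("google/codegemma-7b")
model = PeftModel.from_pretrained(base, "Zacktree/code-bench-CodeGemma-7B-cgv1-ds")

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```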

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an approximate `TrainingArguments` sketch follows the list):

- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 3
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 8
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 2
- mixed_precision_training: Native AMP
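
Although the training script is not published, these settings map almost one-to-one onto `transformers.TrainingArguments`. A sketch under that assumption (`output_dir` is a placeholder, the 50-step eval cadence is inferred from the results table below, and the Adam betas/epsilon listed above are the library defaults, so they need no explicit arguments):

```python
# Approximate TrainingArguments for the hyperparameters listed above.
# Not the author's actual script; output_dir and eval cadence are assumptions.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="code-bench-CodeGemma-7B-cgv1-ds",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=3,
    seed=42,
    gradient_accumulation_steps=8,  # gives total_train_batch_size = 8
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=2,
    fp16=True,                      # "Native AMP" mixed precision
    eval_strategy="steps",
    eval_steps=50,                  # matches the evaluation rows below
    logging_steps=50,
)
```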

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.9203        | 0.0530 | 50   | 1.0306          |
| 0.551         | 0.1061 | 100  | 0.5383          |
| 0.4483        | 0.1591 | 150  | 0.4048          |
| 0.3469        | 0.2121 | 200  | 0.3013          |
| 0.2868        | 0.2652 | 250  | 0.2447          |
| 0.2307        | 0.3182 | 300  | 0.2061          |
| 0.1972        | 0.3713 | 350  | 0.1727          |
| 0.1716        | 0.4243 | 400  | 0.1525          |
| 0.1612        | 0.4773 | 450  | 0.1468          |
| 0.1631        | 0.5304 | 500  | 0.1400          |
| 0.1739        | 0.5834 | 550  | 0.1376          |
| 0.148         | 0.6364 | 600  | 0.1330          |
| 0.1413        | 0.6895 | 650  | 0.1274          |
| 0.1464        | 0.7425 | 700  | 0.1267          |
| 0.1376        | 0.7955 | 750  | 0.1240          |
| 0.1287        | 0.8486 | 800  | 0.1210          |
| 0.1402        | 0.9016 | 850  | 0.1198          |
| 0.1261        | 0.9547 | 900  | 0.1173          |
| 0.1195        | 1.0077 | 950  | 0.1145          |
| 0.1254        | 1.0607 | 1000 | 0.1133          |
| 0.1109        | 1.1138 | 1050 | 0.1119          |
| 0.1206        | 1.1668 | 1100 | 0.1093          |
| 0.1195        | 1.2198 | 1150 | 0.1084          |
| 0.1237        | 1.2729 | 1200 | 0.1073          |
| 0.1205        | 1.3259 | 1250 | 0.1064          |
| 0.1105        | 1.3789 | 1300 | 0.1048          |
| 0.1027        | 1.4320 | 1350 | 0.1038          |
| 0.1128        | 1.4850 | 1400 | 0.1035          |
| 0.1207        | 1.5381 | 1450 | 0.1030          |
| 0.1057        | 1.5911 | 1500 | 0.1013          |
| 0.1056        | 1.6441 | 1550 | 0.0996          |
| 0.1086        | 1.6972 | 1600 | 0.0985          |
| 0.1078        | 1.7502 | 1650 | 0.0982          |
| 0.0987        | 1.8032 | 1700 | 0.0968          |
| 0.1037        | 1.8563 | 1750 | 0.0960          |
| 0.1047        | 1.9093 | 1800 | 0.0957          |
| 0.1045        | 1.9623 | 1850 | 0.0947          |

### Framework versions

- PEFT 0.12.0
- Transformers 4.44.2
- PyTorch 2.5.1+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1
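
For reproducing the environment, the pins above translate directly into a pip install; the `--index-url` shown is the standard PyTorch wheel index for `+cu121` builds:

```bash
pip install peft==0.12.0 transformers==4.44.2 datasets==2.21.0 tokenizers==0.19.1
pip install torch==2.5.1 --index-url https://download.pytorch.org/whl/cu121
```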