Zacktree committed
Commit b002745 · verified · 1 parent: 0f2e6c3

Model save
README.md ADDED
@@ -0,0 +1,102 @@
---
library_name: peft
license: gemma
base_model: google/codegemma-7b
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: code-bench-CodeGemma-7B-cgv1-ds
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# code-bench-CodeGemma-7B-cgv1-ds

This model is a fine-tuned version of [google/codegemma-7b](https://huggingface.co/google/codegemma-7b) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0947
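
Because `library_name: peft` marks this repository as an adapter rather than a full checkpoint, inference needs the `google/codegemma-7b` base weights with the adapter attached on top. Below is a minimal sketch, assuming the adapter is hosted at `Zacktree/code-bench-CodeGemma-7B-cgv1-ds` (a repo id inferred from the model name, not confirmed by this card):

```python
# Minimal inference sketch for a PEFT adapter on top of google/codegemma-7b.
# The adapter repo id below is an assumption inferred from the model name.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_ID = "google/codegemma-7b"
ADAPTER_ID = "Zacktree/code-bench-CodeGemma-7B-cgv1-ds"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
model = AutoModelForCausalLM.from_pretrained(
    BASE_ID, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, ADAPTER_ID)  # attach the trained adapter
model.eval()

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For deployment without the PEFT wrapper, `merge_and_unload()` can fold the adapter into the base weights. The framework versions pinned at the bottom of this card are the ones the adapter was trained with.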
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged configuration sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 3
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 8
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 2
- mixed_precision_training: Native AMP
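
Given the `trl` and `sft` tags, these values most plausibly map onto a TRL `SFTTrainer` run. The sketch below is a hedged reconstruction: the dataset and the adapter hyperparameters are not recorded in this card, so `train_dataset` and the `LoraConfig` values are illustrative placeholders.

```python
# Hedged reconstruction of the run from the hyperparameters above.
# The dataset and adapter settings are NOT recorded in this card, so
# train_dataset and the LoraConfig below are illustrative placeholders.
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

train_dataset = ...  # placeholder: the training data is not recorded

args = SFTConfig(
    output_dir="code-bench-CodeGemma-7B-cgv1-ds",
    learning_rate=5e-5,
    per_device_train_batch_size=1,  # train_batch_size: 1
    per_device_eval_batch_size=3,   # eval_batch_size: 3
    gradient_accumulation_steps=8,  # total_train_batch_size: 8
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,              # lr_scheduler_warmup_ratio: 0.03
    num_train_epochs=2,
    seed=42,
    fp16=True,                      # mixed_precision_training: Native AMP
)

# AdamW with betas=(0.9, 0.999) and eps=1e-8 is already the default optimizer.
trainer = SFTTrainer(
    model="google/codegemma-7b",
    args=args,
    train_dataset=train_dataset,
    peft_config=LoraConfig(task_type="CAUSAL_LM"),  # actual adapter values unknown
)
trainer.train()
```

`SFTConfig` subclasses `TrainingArguments`, so the hyperparameters above map one-to-one; passing `model` as a string lets TRL load the base checkpoint and tokenizer itself.
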
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.9203 | 0.0530 | 50 | 1.0306 |
| 0.551 | 0.1061 | 100 | 0.5383 |
| 0.4483 | 0.1591 | 150 | 0.4048 |
| 0.3469 | 0.2121 | 200 | 0.3013 |
| 0.2868 | 0.2652 | 250 | 0.2447 |
| 0.2307 | 0.3182 | 300 | 0.2061 |
| 0.1972 | 0.3713 | 350 | 0.1727 |
| 0.1716 | 0.4243 | 400 | 0.1525 |
| 0.1612 | 0.4773 | 450 | 0.1468 |
| 0.1631 | 0.5304 | 500 | 0.1400 |
| 0.1739 | 0.5834 | 550 | 0.1376 |
| 0.148 | 0.6364 | 600 | 0.1330 |
| 0.1413 | 0.6895 | 650 | 0.1274 |
| 0.1464 | 0.7425 | 700 | 0.1267 |
| 0.1376 | 0.7955 | 750 | 0.1240 |
| 0.1287 | 0.8486 | 800 | 0.1210 |
| 0.1402 | 0.9016 | 850 | 0.1198 |
| 0.1261 | 0.9547 | 900 | 0.1173 |
| 0.1195 | 1.0077 | 950 | 0.1145 |
| 0.1254 | 1.0607 | 1000 | 0.1133 |
| 0.1109 | 1.1138 | 1050 | 0.1119 |
| 0.1206 | 1.1668 | 1100 | 0.1093 |
| 0.1195 | 1.2198 | 1150 | 0.1084 |
| 0.1237 | 1.2729 | 1200 | 0.1073 |
| 0.1205 | 1.3259 | 1250 | 0.1064 |
| 0.1105 | 1.3789 | 1300 | 0.1048 |
| 0.1027 | 1.4320 | 1350 | 0.1038 |
| 0.1128 | 1.4850 | 1400 | 0.1035 |
| 0.1207 | 1.5381 | 1450 | 0.1030 |
| 0.1057 | 1.5911 | 1500 | 0.1013 |
| 0.1056 | 1.6441 | 1550 | 0.0996 |
| 0.1086 | 1.6972 | 1600 | 0.0985 |
| 0.1078 | 1.7502 | 1650 | 0.0982 |
| 0.0987 | 1.8032 | 1700 | 0.0968 |
| 0.1037 | 1.8563 | 1750 | 0.0960 |
| 0.1047 | 1.9093 | 1800 | 0.0957 |
| 0.1045 | 1.9623 | 1850 | 0.0947 |

### Framework versions

- PEFT 0.12.0
- Transformers 4.44.2
- PyTorch 2.5.1+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1
runs/Sep10_18-53-35_m3h110/events.out.tfevents.1757495450.m3h110.2356734.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a9bc6b0ff94831e51de171fecf53937ea3b4094670d2e5a1b57549dbb0f0806a
- size 37222
+ oid sha256:d33bf9793ffe67c5ac0b2f9ee361488a798020890886b5699a40fd4bc706bff8
+ size 39535