code-bench-CodeGemma-7B-cgv1-ds

This model is a fine-tuned version of google/codegemma-7b on an unspecified dataset (the training configuration did not record a dataset name). It achieves the following results on the evaluation set:

Loss: 0.1137

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 1
eval_batch_size: 1
seed: 42
distributed_type: multi-GPU
gradient_accumulation_steps: 8
total_train_batch_size: 8
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.03
num_epochs: 6
mixed_precision_training: Native AMP
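
To make the listed values concrete, here is a minimal sketch of how the effective batch size and the learning rate at a given step could be derived from them. It assumes the usual shape of a cosine schedule with linear warmup; the total step count (5650, the last step logged in the results table) and the process count of 1 are assumptions, since the card states neither directly. The names effective_batch_size and lr_at are illustrative only.

```python
import math

# Values reported in the hyperparameter list.
LEARNING_RATE = 5e-05
PER_DEVICE_BATCH = 1
GRAD_ACCUM_STEPS = 8
WARMUP_RATIO = 0.03

# Assumption: about 5650 optimizer steps in total (the last logged step);
# the card does not state the exact number.
TOTAL_STEPS = 5650


def effective_batch_size(per_device: int, accum_steps: int, num_processes: int) -> int:
    """Samples contributing to one optimizer update."""
    return per_device * accum_steps * num_processes


def lr_at(step: int, total_steps: int = TOTAL_STEPS) -> float:
    """Linear warmup to the peak rate, then cosine decay towards zero."""
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


# Assumption: one process, which reproduces the reported total of 8.
print(effective_batch_size(PER_DEVICE_BATCH, GRAD_ACCUM_STEPS, num_processes=1))

for step in (0, 85, 170, 2910, 5650):
    print(step, lr_at(step))
```

Under these assumptions the warmup covers the first 170 steps, after which the rate decays from 5e-05 towards zero by the end of training.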

Training results

Training Loss	Epoch	Step	Validation Loss
4.7885	0.0530	50	3.2152
0.6747	0.1061	100	0.6309
0.5689	0.1591	150	0.5025
0.4619	0.2121	200	0.4133
0.4034	0.2652	250	0.3695
0.3182	0.3182	300	0.3194
0.2859	0.3713	350	0.2842
0.2577	0.4243	400	0.2579
0.2488	0.4773	450	0.2471
0.2357	0.5304	500	0.2397
0.2614	0.5834	550	0.2292
0.2205	0.6364	600	0.2252
0.218	0.6895	650	0.2235
0.2277	0.7425	700	0.2176
0.221	0.7955	750	0.2148
0.2109	0.8486	800	0.2103
0.2092	0.9016	850	0.2099
0.2046	0.9547	900	0.2039
0.1899	1.0077	950	0.2024
0.1844	1.0607	1000	0.1971
0.1785	1.1138	1050	0.1938
0.1852	1.1668	1100	0.1920
0.1885	1.2198	1150	0.1893
0.1859	1.2729	1200	0.1860
0.1813	1.3259	1250	0.1853
0.1587	1.3789	1300	0.1833
0.1631	1.4320	1350	0.1814
0.1693	1.4850	1400	0.1793
0.174	1.5381	1450	0.1774
0.1674	1.5911	1500	0.1750
0.1567	1.6441	1550	0.1732
0.1702	1.6972	1600	0.1718
0.161	1.7502	1650	0.1704
0.1656	1.8032	1700	0.1687
0.1704	1.8563	1750	0.1701
0.149	1.9093	1800	0.1658
0.1604	1.9623	1850	0.1632
0.1482	2.0154	1900	0.1636
0.1421	2.0684	1950	0.1614
0.1555	2.1215	2000	0.1603
0.1387	2.1745	2050	0.1588
0.1331	2.2275	2100	0.1587
0.1349	2.2806	2150	0.1556
0.1233	2.3336	2200	0.1549
0.1383	2.3866	2250	0.1533
0.1411	2.4397	2300	0.1535
0.1369	2.4927	2350	0.1512
0.1342	2.5457	2400	0.1500
0.1292	2.5988	2450	0.1480
0.1393	2.6518	2500	0.1480
0.1281	2.7049	2550	0.1477
0.1379	2.7579	2600	0.1456
0.1335	2.8109	2650	0.1439
0.1356	2.8640	2700	0.1438
0.1203	2.9170	2750	0.1427
0.1399	2.9700	2800	0.1411
0.1162	3.0231	2850	0.1396
0.123	3.0761	2900	0.1395
0.1	3.1291	2950	0.1390
0.1182	3.1822	3000	0.1365
0.1043	3.2352	3050	0.1376
0.1052	3.2883	3100	0.1354
0.1039	3.3413	3150	0.1343
0.1101	3.3943	3200	0.1339
0.1161	3.4474	3250	0.1340
0.1031	3.5004	3300	0.1319
0.1044	3.5534	3350	0.1314
0.0936	3.6065	3400	0.1307
0.1057	3.6595	3450	0.1307
0.1103	3.7125	3500	0.1290
0.1055	3.7656	3550	0.1283
0.1044	3.8186	3600	0.1270
0.101	3.8717	3650	0.1259
0.1014	3.9247	3700	0.1247
0.1113	3.9777	3750	0.1239
0.0898	4.0308	3800	0.1270
0.0891	4.0838	3850	0.1243
0.0869	4.1368	3900	0.1244
0.0955	4.1899	3950	0.1227
0.0894	4.2429	4000	0.1217
0.0871	4.2959	4050	0.1222
0.0885	4.3490	4100	0.1212
0.0873	4.4020	4150	0.1214
0.0873	4.4551	4200	0.1198
0.0866	4.5081	4250	0.1208
0.0866	4.5611	4300	0.1183
0.08	4.6142	4350	0.1177
0.0899	4.6672	4400	0.1172
0.0798	4.7202	4450	0.1174
0.078	4.7733	4500	0.1161
0.0763	4.8263	4550	0.1158
0.0832	4.8793	4600	0.1159
0.0918	4.9324	4650	0.1161
0.0799	4.9854	4700	0.1149
0.077	5.0385	4750	0.1150
0.0769	5.0915	4800	0.1148
0.0859	5.1445	4850	0.1151
0.0746	5.1976	4900	0.1147
0.0729	5.2506	4950	0.1143
0.0759	5.3036	5000	0.1144
0.0813	5.3567	5050	0.1144
0.0663	5.4097	5100	0.1147
0.072	5.4627	5150	0.1142
0.0743	5.5158	5200	0.1140
0.0755	5.5688	5250	0.1139
0.0766	5.6219	5300	0.1140
0.0701	5.6749	5350	0.1138
0.0837	5.7279	5400	0.1137
0.0799	5.7810	5450	0.1137
0.0797	5.8340	5500	0.1138
0.0813	5.8870	5550	0.1138
0.0717	5.9401	5600	0.1138
0.0746	5.9931	5650	0.1137
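
As a rough aid to reading the table, the sketch below converts the final validation loss into a perplexity and estimates the number of optimizer steps per epoch from the last row. It assumes the reported loss is a mean per-token cross-entropy in nats, which is the common convention for causal language models but is not stated in the card. The helper names are illustrative.

```python
import math

# Taken from the last row of the results table.
FINAL_EVAL_LOSS = 0.1137
LAST_STEP = 5650
LAST_EPOCH = 5.9931


def perplexity(mean_nll: float) -> float:
    """Perplexity for a mean negative log-likelihood given in nats (assumed)."""
    return math.exp(mean_nll)


def steps_per_epoch(step: int, epoch: float) -> float:
    """Estimate optimizer steps per epoch from one (step, epoch) pair."""
    return step / epoch


print(f"perplexity ~ {perplexity(FINAL_EVAL_LOSS):.3f}")
print(f"steps per epoch ~ {steps_per_epoch(LAST_STEP, LAST_EPOCH):.1f}")
```

With these inputs the perplexity comes out at roughly 1.12 and the run at a little under 943 steps per epoch. The validation loss changes very little after about epoch 5, while the training loss keeps falling, so later checkpoints may offer little additional benefit on the evaluation set.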

Framework versions

PEFT 0.12.0
Transformers 4.44.2
Pytorch 2.5.1+cu121
Datasets 2.21.0
Tokenizers 0.19.1
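
Because PEFT appears in the framework versions, the published weights are probably an adapter on top of the base model rather than a full checkpoint. The following is a minimal loading sketch under that assumption. The adapter ID is the repository name shown on the model page, bfloat16 is an assumed dtype, and load_finetuned is a hypothetical helper name, not something the card defines.

```python
BASE_MODEL_ID = "google/codegemma-7b"
# Assumption: the adapter is hosted under the repository name on the model page.
ADAPTER_ID = "Zacktree/code-bench-CodeGemma-7B-cgv1-dsv1"


def load_finetuned(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the fine-tuned PEFT adapter."""
    # Imported lazily so that defining this helper does not require the libraries.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    # Assumption: bfloat16 weights are acceptable on the target hardware.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model


# Example usage (downloads several gigabytes of weights):
# tokenizer, model = load_finetuned()
# inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
# output_ids = model.generate(**inputs, max_new_tokens=64)
# print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

If the repository turns out to hold merged full weights instead of an adapter, loading it directly with AutoModelForCausalLM.from_pretrained would be the simpler route.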

