Upload folder using huggingface_hub
checkpoints/grad_l2_over_steps.png
CHANGED
checkpoints/loss_over_steps.png
CHANGED
checkpoints/lr_over_steps.png
CHANGED
checkpoints/main.log
CHANGED
@@ -1309,3 +1309,53 @@ Mixed precision type: bf16
 [2024-08-11 23:41:22,601][accelerate.checkpointing][INFO] - Sampler state for dataloader 0 saved in checkpoint-pt-60000/sampler.bin
 [2024-08-11 23:41:22,601][accelerate.checkpointing][INFO] - Sampler state for dataloader 1 saved in checkpoint-pt-60000/sampler_1.bin
 [2024-08-11 23:41:22,602][accelerate.checkpointing][INFO] - Random states saved in checkpoint-pt-60000/random_states_0.pkl
+[2024-08-11 23:45:07,108][Main][INFO] - [train] Step 60050 out of 80000 | Loss --> 1.754 | Grad_l2 --> 0.312 | Weights_l2 --> 9084.007 | Lr --> 0.002 | Seconds_per_step --> 4.624 |
+[2024-08-11 23:48:44,602][Main][INFO] - [train] Step 60100 out of 80000 | Loss --> 1.760 | Grad_l2 --> 0.310 | Weights_l2 --> 9083.915 | Lr --> 0.002 | Seconds_per_step --> 4.350 |
+[2024-08-11 23:52:26,043][Main][INFO] - [train] Step 60150 out of 80000 | Loss --> 1.760 | Grad_l2 --> 0.312 | Weights_l2 --> 9083.829 | Lr --> 0.001 | Seconds_per_step --> 4.429 |
+[2024-08-11 23:56:03,737][Main][INFO] - [train] Step 60200 out of 80000 | Loss --> 1.771 | Grad_l2 --> 0.311 | Weights_l2 --> 9083.733 | Lr --> 0.001 | Seconds_per_step --> 4.354 |
+[2024-08-11 23:59:47,660][Main][INFO] - [train] Step 60250 out of 80000 | Loss --> 1.767 | Grad_l2 --> 0.312 | Weights_l2 --> 9083.640 | Lr --> 0.001 | Seconds_per_step --> 4.478 |
+[2024-08-12 00:03:32,244][Main][INFO] - [train] Step 60300 out of 80000 | Loss --> 1.771 | Grad_l2 --> 0.313 | Weights_l2 --> 9083.550 | Lr --> 0.001 | Seconds_per_step --> 4.492 |
+[2024-08-12 00:07:17,431][Main][INFO] - [train] Step 60350 out of 80000 | Loss --> 1.780 | Grad_l2 --> 0.314 | Weights_l2 --> 9083.451 | Lr --> 0.001 | Seconds_per_step --> 4.504 |
+[2024-08-12 00:11:01,326][Main][INFO] - [train] Step 60400 out of 80000 | Loss --> 1.768 | Grad_l2 --> 0.311 | Weights_l2 --> 9083.362 | Lr --> 0.001 | Seconds_per_step --> 4.478 |
+[2024-08-12 00:14:45,402][Main][INFO] - [train] Step 60450 out of 80000 | Loss --> 1.779 | Grad_l2 --> 0.313 | Weights_l2 --> 9083.267 | Lr --> 0.001 | Seconds_per_step --> 4.482 |
+[2024-08-12 00:18:30,537][Main][INFO] - [train] Step 60500 out of 80000 | Loss --> 1.780 | Grad_l2 --> 0.314 | Weights_l2 --> 9083.181 | Lr --> 0.001 | Seconds_per_step --> 4.503 |
+[2024-08-12 00:22:22,904][Main][INFO] - [train] Step 60550 out of 80000 | Loss --> 1.776 | Grad_l2 --> 0.312 | Weights_l2 --> 9083.093 | Lr --> 0.001 | Seconds_per_step --> 4.647 |
+[2024-08-12 00:26:07,858][Main][INFO] - [train] Step 60600 out of 80000 | Loss --> 1.779 | Grad_l2 --> 0.315 | Weights_l2 --> 9082.995 | Lr --> 0.001 | Seconds_per_step --> 4.499 |
+[2024-08-12 00:29:46,792][Main][INFO] - [train] Step 60650 out of 80000 | Loss --> 1.774 | Grad_l2 --> 0.312 | Weights_l2 --> 9082.900 | Lr --> 0.001 | Seconds_per_step --> 4.379 |
+[2024-08-12 00:33:31,405][Main][INFO] - [train] Step 60700 out of 80000 | Loss --> 1.774 | Grad_l2 --> 0.313 | Weights_l2 --> 9082.805 | Lr --> 0.001 | Seconds_per_step --> 4.492 |
+[2024-08-12 00:37:13,998][Main][INFO] - [train] Step 60750 out of 80000 | Loss --> 1.771 | Grad_l2 --> 0.313 | Weights_l2 --> 9082.715 | Lr --> 0.001 | Seconds_per_step --> 4.452 |
+[2024-08-12 00:40:53,937][Main][INFO] - [train] Step 60800 out of 80000 | Loss --> 1.766 | Grad_l2 --> 0.314 | Weights_l2 --> 9082.626 | Lr --> 0.001 | Seconds_per_step --> 4.399 |
+[2024-08-12 00:44:29,587][Main][INFO] - [train] Step 60850 out of 80000 | Loss --> 1.773 | Grad_l2 --> 0.315 | Weights_l2 --> 9082.535 | Lr --> 0.001 | Seconds_per_step --> 4.313 |
+[2024-08-12 00:48:07,873][Main][INFO] - [train] Step 60900 out of 80000 | Loss --> 1.778 | Grad_l2 --> 0.315 | Weights_l2 --> 9082.441 | Lr --> 0.001 | Seconds_per_step --> 4.366 |
+[2024-08-12 00:51:44,166][Main][INFO] - [train] Step 60950 out of 80000 | Loss --> 1.784 | Grad_l2 --> 0.314 | Weights_l2 --> 9082.349 | Lr --> 0.001 | Seconds_per_step --> 4.326 |
+[2024-08-12 00:55:24,836][Main][INFO] - [train] Step 61000 out of 80000 | Loss --> 1.775 | Grad_l2 --> 0.313 | Weights_l2 --> 9082.260 | Lr --> 0.001 | Seconds_per_step --> 4.413 |
+[2024-08-12 00:59:05,951][Main][INFO] - [train] Step 61050 out of 80000 | Loss --> 1.770 | Grad_l2 --> 0.314 | Weights_l2 --> 9082.170 | Lr --> 0.001 | Seconds_per_step --> 4.422 |
+[2024-08-12 01:02:44,096][Main][INFO] - [train] Step 61100 out of 80000 | Loss --> 1.763 | Grad_l2 --> 0.315 | Weights_l2 --> 9082.077 | Lr --> 0.001 | Seconds_per_step --> 4.363 |
+[2024-08-12 01:06:23,695][Main][INFO] - [train] Step 61150 out of 80000 | Loss --> 1.771 | Grad_l2 --> 0.315 | Weights_l2 --> 9081.988 | Lr --> 0.001 | Seconds_per_step --> 4.392 |
+[2024-08-12 01:10:01,742][Main][INFO] - [train] Step 61200 out of 80000 | Loss --> 1.769 | Grad_l2 --> 0.315 | Weights_l2 --> 9081.898 | Lr --> 0.001 | Seconds_per_step --> 4.361 |
+[2024-08-12 01:13:39,844][Main][INFO] - [train] Step 61250 out of 80000 | Loss --> 1.767 | Grad_l2 --> 0.315 | Weights_l2 --> 9081.809 | Lr --> 0.001 | Seconds_per_step --> 4.362 |
+[2024-08-12 01:17:15,187][Main][INFO] - [train] Step 61300 out of 80000 | Loss --> 1.767 | Grad_l2 --> 0.316 | Weights_l2 --> 9081.720 | Lr --> 0.001 | Seconds_per_step --> 4.307 |
+[2024-08-12 01:20:53,488][Main][INFO] - [train] Step 61350 out of 80000 | Loss --> 1.762 | Grad_l2 --> 0.318 | Weights_l2 --> 9081.631 | Lr --> 0.001 | Seconds_per_step --> 4.366 |
+[2024-08-12 01:24:34,284][Main][INFO] - [train] Step 61400 out of 80000 | Loss --> 1.764 | Grad_l2 --> 0.317 | Weights_l2 --> 9081.538 | Lr --> 0.001 | Seconds_per_step --> 4.416 |
+[2024-08-12 01:28:12,531][Main][INFO] - [train] Step 61450 out of 80000 | Loss --> 1.768 | Grad_l2 --> 0.317 | Weights_l2 --> 9081.448 | Lr --> 0.001 | Seconds_per_step --> 4.365 |
+[2024-08-12 01:31:52,508][Main][INFO] - [train] Step 61500 out of 80000 | Loss --> 1.770 | Grad_l2 --> 0.314 | Weights_l2 --> 9081.354 | Lr --> 0.001 | Seconds_per_step --> 4.400 |
+[2024-08-12 01:35:34,640][Main][INFO] - [train] Step 61550 out of 80000 | Loss --> 1.760 | Grad_l2 --> 0.314 | Weights_l2 --> 9081.260 | Lr --> 0.001 | Seconds_per_step --> 4.443 |
+[2024-08-12 01:39:17,817][Main][INFO] - [train] Step 61600 out of 80000 | Loss --> 1.766 | Grad_l2 --> 0.313 | Weights_l2 --> 9081.169 | Lr --> 0.001 | Seconds_per_step --> 4.464 |
+[2024-08-12 01:42:56,472][Main][INFO] - [train] Step 61650 out of 80000 | Loss --> 1.762 | Grad_l2 --> 0.316 | Weights_l2 --> 9081.077 | Lr --> 0.001 | Seconds_per_step --> 4.373 |
+[2024-08-12 01:46:34,161][Main][INFO] - [train] Step 61700 out of 80000 | Loss --> 1.771 | Grad_l2 --> 0.317 | Weights_l2 --> 9080.988 | Lr --> 0.001 | Seconds_per_step --> 4.354 |
+[2024-08-12 01:50:15,341][Main][INFO] - [train] Step 61750 out of 80000 | Loss --> 1.759 | Grad_l2 --> 0.315 | Weights_l2 --> 9080.903 | Lr --> 0.001 | Seconds_per_step --> 4.424 |
+[2024-08-12 01:53:56,286][Main][INFO] - [train] Step 61800 out of 80000 | Loss --> 1.760 | Grad_l2 --> 0.316 | Weights_l2 --> 9080.810 | Lr --> 0.001 | Seconds_per_step --> 4.419 |
+[2024-08-12 01:57:35,453][Main][INFO] - [train] Step 61850 out of 80000 | Loss --> 1.764 | Grad_l2 --> 0.317 | Weights_l2 --> 9080.725 | Lr --> 0.001 | Seconds_per_step --> 4.383 |
+[2024-08-12 02:01:14,106][Main][INFO] - [train] Step 61900 out of 80000 | Loss --> 1.765 | Grad_l2 --> 0.316 | Weights_l2 --> 9080.646 | Lr --> 0.001 | Seconds_per_step --> 4.373 |
+[2024-08-12 02:04:55,693][Main][INFO] - [train] Step 61950 out of 80000 | Loss --> 1.756 | Grad_l2 --> 0.316 | Weights_l2 --> 9080.552 | Lr --> 0.001 | Seconds_per_step --> 4.432 |
+[2024-08-12 02:08:35,956][Main][INFO] - [train] Step 62000 out of 80000 | Loss --> 1.757 | Grad_l2 --> 0.317 | Weights_l2 --> 9080.465 | Lr --> 0.001 | Seconds_per_step --> 4.405 |
+[2024-08-12 02:12:08,062][Main][INFO] - [train] Step 62050 out of 80000 | Loss --> 1.763 | Grad_l2 --> 0.316 | Weights_l2 --> 9080.379 | Lr --> 0.001 | Seconds_per_step --> 4.242 |
+[2024-08-12 02:15:46,511][Main][INFO] - [train] Step 62100 out of 80000 | Loss --> 1.762 | Grad_l2 --> 0.316 | Weights_l2 --> 9080.297 | Lr --> 0.001 | Seconds_per_step --> 4.369 |
+[2024-08-12 02:19:22,962][Main][INFO] - [train] Step 62150 out of 80000 | Loss --> 1.759 | Grad_l2 --> 0.318 | Weights_l2 --> 9080.214 | Lr --> 0.001 | Seconds_per_step --> 4.329 |
+[2024-08-12 02:22:58,963][Main][INFO] - [train] Step 62200 out of 80000 | Loss --> 1.760 | Grad_l2 --> 0.317 | Weights_l2 --> 9080.127 | Lr --> 0.001 | Seconds_per_step --> 4.320 |
+[2024-08-12 02:26:37,400][Main][INFO] - [train] Step 62250 out of 80000 | Loss --> 1.757 | Grad_l2 --> 0.317 | Weights_l2 --> 9080.042 | Lr --> 0.001 | Seconds_per_step --> 4.369 |
+[2024-08-12 02:30:16,727][Main][INFO] - [train] Step 62300 out of 80000 | Loss --> 1.772 | Grad_l2 --> 0.319 | Weights_l2 --> 9079.958 | Lr --> 0.001 | Seconds_per_step --> 4.387 |
+[2024-08-12 02:33:56,219][Main][INFO] - [train] Step 62350 out of 80000 | Loss --> 1.757 | Grad_l2 --> 0.316 | Weights_l2 --> 9079.868 | Lr --> 0.001 | Seconds_per_step --> 4.390 |
+[2024-08-12 02:37:37,466][Main][INFO] - [train] Step 62400 out of 80000 | Loss --> 1.758 | Grad_l2 --> 0.318 | Weights_l2 --> 9079.775 | Lr --> 0.001 | Seconds_per_step --> 4.425 |
+[2024-08-12 02:41:13,861][Main][INFO] - [train] Step 62450 out of 80000 | Loss --> 1.756 | Grad_l2 --> 0.318 | Weights_l2 --> 9079.691 | Lr --> 0.001 | Seconds_per_step --> 4.328 |
+[2024-08-12 02:44:48,098][Main][INFO] - [train] Step 62500 out of 80000 | Loss --> 1.754 | Grad_l2 --> 0.316 | Weights_l2 --> 9079.610 | Lr --> 0.001 | Seconds_per_step --> 4.285 |
checkpoints/seconds_per_step_over_steps.png
CHANGED
checkpoints/training_metrics.csv
CHANGED
@@ -1199,3 +1199,53 @@ timestamp,step,loss,grad_l2,weights_l2,lr,seconds_per_step
 "2024-08-11 23:33:42,430",59900,1.761,0.312,9084.295,0.002,4.371
 "2024-08-11 23:37:30,256",59950,1.749,0.313,9084.198,0.002,4.556
 "2024-08-11 23:41:15,929",60000,1.763,0.311,9084.104,0.002,4.513
+"2024-08-11 23:45:07,108",60050,1.754,0.312,9084.007,0.002,4.624
+"2024-08-11 23:48:44,602",60100,1.76,0.31,9083.915,0.002,4.35
+"2024-08-11 23:52:26,043",60150,1.76,0.312,9083.829,0.001,4.429
+"2024-08-11 23:56:03,737",60200,1.771,0.311,9083.733,0.001,4.354
+"2024-08-11 23:59:47,660",60250,1.767,0.312,9083.64,0.001,4.478
+"2024-08-12 00:03:32,244",60300,1.771,0.313,9083.55,0.001,4.492
+"2024-08-12 00:07:17,431",60350,1.78,0.314,9083.451,0.001,4.504
+"2024-08-12 00:11:01,326",60400,1.768,0.311,9083.362,0.001,4.478
+"2024-08-12 00:14:45,402",60450,1.779,0.313,9083.267,0.001,4.482
+"2024-08-12 00:18:30,537",60500,1.78,0.314,9083.181,0.001,4.503
+"2024-08-12 00:22:22,904",60550,1.776,0.312,9083.093,0.001,4.647
+"2024-08-12 00:26:07,858",60600,1.779,0.315,9082.995,0.001,4.499
+"2024-08-12 00:29:46,792",60650,1.774,0.312,9082.9,0.001,4.379
+"2024-08-12 00:33:31,405",60700,1.774,0.313,9082.805,0.001,4.492
+"2024-08-12 00:37:13,998",60750,1.771,0.313,9082.715,0.001,4.452
+"2024-08-12 00:40:53,937",60800,1.766,0.314,9082.626,0.001,4.399
+"2024-08-12 00:44:29,587",60850,1.773,0.315,9082.535,0.001,4.313
+"2024-08-12 00:48:07,873",60900,1.778,0.315,9082.441,0.001,4.366
+"2024-08-12 00:51:44,166",60950,1.784,0.314,9082.349,0.001,4.326
+"2024-08-12 00:55:24,836",61000,1.775,0.313,9082.26,0.001,4.413
+"2024-08-12 00:59:05,951",61050,1.77,0.314,9082.17,0.001,4.422
+"2024-08-12 01:02:44,096",61100,1.763,0.315,9082.077,0.001,4.363
+"2024-08-12 01:06:23,695",61150,1.771,0.315,9081.988,0.001,4.392
+"2024-08-12 01:10:01,742",61200,1.769,0.315,9081.898,0.001,4.361
+"2024-08-12 01:13:39,844",61250,1.767,0.315,9081.809,0.001,4.362
+"2024-08-12 01:17:15,187",61300,1.767,0.316,9081.72,0.001,4.307
+"2024-08-12 01:20:53,488",61350,1.762,0.318,9081.631,0.001,4.366
+"2024-08-12 01:24:34,284",61400,1.764,0.317,9081.538,0.001,4.416
+"2024-08-12 01:28:12,531",61450,1.768,0.317,9081.448,0.001,4.365
+"2024-08-12 01:31:52,508",61500,1.77,0.314,9081.354,0.001,4.4
+"2024-08-12 01:35:34,640",61550,1.76,0.314,9081.26,0.001,4.443
+"2024-08-12 01:39:17,817",61600,1.766,0.313,9081.169,0.001,4.464
+"2024-08-12 01:42:56,472",61650,1.762,0.316,9081.077,0.001,4.373
+"2024-08-12 01:46:34,161",61700,1.771,0.317,9080.988,0.001,4.354
+"2024-08-12 01:50:15,341",61750,1.759,0.315,9080.903,0.001,4.424
+"2024-08-12 01:53:56,286",61800,1.76,0.316,9080.81,0.001,4.419
+"2024-08-12 01:57:35,453",61850,1.764,0.317,9080.725,0.001,4.383
+"2024-08-12 02:01:14,106",61900,1.765,0.316,9080.646,0.001,4.373
+"2024-08-12 02:04:55,693",61950,1.756,0.316,9080.552,0.001,4.432
+"2024-08-12 02:08:35,956",62000,1.757,0.317,9080.465,0.001,4.405
+"2024-08-12 02:12:08,062",62050,1.763,0.316,9080.379,0.001,4.242
+"2024-08-12 02:15:46,511",62100,1.762,0.316,9080.297,0.001,4.369
+"2024-08-12 02:19:22,962",62150,1.759,0.318,9080.214,0.001,4.329
+"2024-08-12 02:22:58,963",62200,1.76,0.317,9080.127,0.001,4.32
+"2024-08-12 02:26:37,400",62250,1.757,0.317,9080.042,0.001,4.369
+"2024-08-12 02:30:16,727",62300,1.772,0.319,9079.958,0.001,4.387
+"2024-08-12 02:33:56,219",62350,1.757,0.316,9079.868,0.001,4.39
+"2024-08-12 02:37:37,466",62400,1.758,0.318,9079.775,0.001,4.425
+"2024-08-12 02:41:13,861",62450,1.756,0.318,9079.691,0.001,4.328
+"2024-08-12 02:44:48,098",62500,1.754,0.316,9079.61,0.001,4.285
checkpoints/weights_l2_over_steps.png
CHANGED