pszemraj committed on
Commit
a70b924
·
verified ·
1 Parent(s): 39bce54

Upload folder using huggingface_hub

checkpoints/grad_l2_over_steps.png CHANGED
checkpoints/loss_over_steps.png CHANGED
checkpoints/lr_over_steps.png CHANGED
checkpoints/main.log CHANGED
@@ -1309,3 +1309,53 @@ Mixed precision type: bf16
1309
  [2024-08-11 23:41:22,601][accelerate.checkpointing][INFO] - Sampler state for dataloader 0 saved in checkpoint-pt-60000/sampler.bin
1310
  [2024-08-11 23:41:22,601][accelerate.checkpointing][INFO] - Sampler state for dataloader 1 saved in checkpoint-pt-60000/sampler_1.bin
1311
  [2024-08-11 23:41:22,602][accelerate.checkpointing][INFO] - Random states saved in checkpoint-pt-60000/random_states_0.pkl
1312
+ [2024-08-11 23:45:07,108][Main][INFO] - [train] Step 60050 out of 80000 | Loss --> 1.754 | Grad_l2 --> 0.312 | Weights_l2 --> 9084.007 | Lr --> 0.002 | Seconds_per_step --> 4.624 |
1313
+ [2024-08-11 23:48:44,602][Main][INFO] - [train] Step 60100 out of 80000 | Loss --> 1.760 | Grad_l2 --> 0.310 | Weights_l2 --> 9083.915 | Lr --> 0.002 | Seconds_per_step --> 4.350 |
1314
+ [2024-08-11 23:52:26,043][Main][INFO] - [train] Step 60150 out of 80000 | Loss --> 1.760 | Grad_l2 --> 0.312 | Weights_l2 --> 9083.829 | Lr --> 0.001 | Seconds_per_step --> 4.429 |
1315
+ [2024-08-11 23:56:03,737][Main][INFO] - [train] Step 60200 out of 80000 | Loss --> 1.771 | Grad_l2 --> 0.311 | Weights_l2 --> 9083.733 | Lr --> 0.001 | Seconds_per_step --> 4.354 |
1316
+ [2024-08-11 23:59:47,660][Main][INFO] - [train] Step 60250 out of 80000 | Loss --> 1.767 | Grad_l2 --> 0.312 | Weights_l2 --> 9083.640 | Lr --> 0.001 | Seconds_per_step --> 4.478 |
1317
+ [2024-08-12 00:03:32,244][Main][INFO] - [train] Step 60300 out of 80000 | Loss --> 1.771 | Grad_l2 --> 0.313 | Weights_l2 --> 9083.550 | Lr --> 0.001 | Seconds_per_step --> 4.492 |
1318
+ [2024-08-12 00:07:17,431][Main][INFO] - [train] Step 60350 out of 80000 | Loss --> 1.780 | Grad_l2 --> 0.314 | Weights_l2 --> 9083.451 | Lr --> 0.001 | Seconds_per_step --> 4.504 |
1319
+ [2024-08-12 00:11:01,326][Main][INFO] - [train] Step 60400 out of 80000 | Loss --> 1.768 | Grad_l2 --> 0.311 | Weights_l2 --> 9083.362 | Lr --> 0.001 | Seconds_per_step --> 4.478 |
1320
+ [2024-08-12 00:14:45,402][Main][INFO] - [train] Step 60450 out of 80000 | Loss --> 1.779 | Grad_l2 --> 0.313 | Weights_l2 --> 9083.267 | Lr --> 0.001 | Seconds_per_step --> 4.482 |
1321
+ [2024-08-12 00:18:30,537][Main][INFO] - [train] Step 60500 out of 80000 | Loss --> 1.780 | Grad_l2 --> 0.314 | Weights_l2 --> 9083.181 | Lr --> 0.001 | Seconds_per_step --> 4.503 |
1322
+ [2024-08-12 00:22:22,904][Main][INFO] - [train] Step 60550 out of 80000 | Loss --> 1.776 | Grad_l2 --> 0.312 | Weights_l2 --> 9083.093 | Lr --> 0.001 | Seconds_per_step --> 4.647 |
1323
+ [2024-08-12 00:26:07,858][Main][INFO] - [train] Step 60600 out of 80000 | Loss --> 1.779 | Grad_l2 --> 0.315 | Weights_l2 --> 9082.995 | Lr --> 0.001 | Seconds_per_step --> 4.499 |
1324
+ [2024-08-12 00:29:46,792][Main][INFO] - [train] Step 60650 out of 80000 | Loss --> 1.774 | Grad_l2 --> 0.312 | Weights_l2 --> 9082.900 | Lr --> 0.001 | Seconds_per_step --> 4.379 |
1325
+ [2024-08-12 00:33:31,405][Main][INFO] - [train] Step 60700 out of 80000 | Loss --> 1.774 | Grad_l2 --> 0.313 | Weights_l2 --> 9082.805 | Lr --> 0.001 | Seconds_per_step --> 4.492 |
1326
+ [2024-08-12 00:37:13,998][Main][INFO] - [train] Step 60750 out of 80000 | Loss --> 1.771 | Grad_l2 --> 0.313 | Weights_l2 --> 9082.715 | Lr --> 0.001 | Seconds_per_step --> 4.452 |
1327
+ [2024-08-12 00:40:53,937][Main][INFO] - [train] Step 60800 out of 80000 | Loss --> 1.766 | Grad_l2 --> 0.314 | Weights_l2 --> 9082.626 | Lr --> 0.001 | Seconds_per_step --> 4.399 |
1328
+ [2024-08-12 00:44:29,587][Main][INFO] - [train] Step 60850 out of 80000 | Loss --> 1.773 | Grad_l2 --> 0.315 | Weights_l2 --> 9082.535 | Lr --> 0.001 | Seconds_per_step --> 4.313 |
1329
+ [2024-08-12 00:48:07,873][Main][INFO] - [train] Step 60900 out of 80000 | Loss --> 1.778 | Grad_l2 --> 0.315 | Weights_l2 --> 9082.441 | Lr --> 0.001 | Seconds_per_step --> 4.366 |
1330
+ [2024-08-12 00:51:44,166][Main][INFO] - [train] Step 60950 out of 80000 | Loss --> 1.784 | Grad_l2 --> 0.314 | Weights_l2 --> 9082.349 | Lr --> 0.001 | Seconds_per_step --> 4.326 |
1331
+ [2024-08-12 00:55:24,836][Main][INFO] - [train] Step 61000 out of 80000 | Loss --> 1.775 | Grad_l2 --> 0.313 | Weights_l2 --> 9082.260 | Lr --> 0.001 | Seconds_per_step --> 4.413 |
1332
+ [2024-08-12 00:59:05,951][Main][INFO] - [train] Step 61050 out of 80000 | Loss --> 1.770 | Grad_l2 --> 0.314 | Weights_l2 --> 9082.170 | Lr --> 0.001 | Seconds_per_step --> 4.422 |
1333
+ [2024-08-12 01:02:44,096][Main][INFO] - [train] Step 61100 out of 80000 | Loss --> 1.763 | Grad_l2 --> 0.315 | Weights_l2 --> 9082.077 | Lr --> 0.001 | Seconds_per_step --> 4.363 |
1334
+ [2024-08-12 01:06:23,695][Main][INFO] - [train] Step 61150 out of 80000 | Loss --> 1.771 | Grad_l2 --> 0.315 | Weights_l2 --> 9081.988 | Lr --> 0.001 | Seconds_per_step --> 4.392 |
1335
+ [2024-08-12 01:10:01,742][Main][INFO] - [train] Step 61200 out of 80000 | Loss --> 1.769 | Grad_l2 --> 0.315 | Weights_l2 --> 9081.898 | Lr --> 0.001 | Seconds_per_step --> 4.361 |
1336
+ [2024-08-12 01:13:39,844][Main][INFO] - [train] Step 61250 out of 80000 | Loss --> 1.767 | Grad_l2 --> 0.315 | Weights_l2 --> 9081.809 | Lr --> 0.001 | Seconds_per_step --> 4.362 |
1337
+ [2024-08-12 01:17:15,187][Main][INFO] - [train] Step 61300 out of 80000 | Loss --> 1.767 | Grad_l2 --> 0.316 | Weights_l2 --> 9081.720 | Lr --> 0.001 | Seconds_per_step --> 4.307 |
1338
+ [2024-08-12 01:20:53,488][Main][INFO] - [train] Step 61350 out of 80000 | Loss --> 1.762 | Grad_l2 --> 0.318 | Weights_l2 --> 9081.631 | Lr --> 0.001 | Seconds_per_step --> 4.366 |
1339
+ [2024-08-12 01:24:34,284][Main][INFO] - [train] Step 61400 out of 80000 | Loss --> 1.764 | Grad_l2 --> 0.317 | Weights_l2 --> 9081.538 | Lr --> 0.001 | Seconds_per_step --> 4.416 |
1340
+ [2024-08-12 01:28:12,531][Main][INFO] - [train] Step 61450 out of 80000 | Loss --> 1.768 | Grad_l2 --> 0.317 | Weights_l2 --> 9081.448 | Lr --> 0.001 | Seconds_per_step --> 4.365 |
1341
+ [2024-08-12 01:31:52,508][Main][INFO] - [train] Step 61500 out of 80000 | Loss --> 1.770 | Grad_l2 --> 0.314 | Weights_l2 --> 9081.354 | Lr --> 0.001 | Seconds_per_step --> 4.400 |
1342
+ [2024-08-12 01:35:34,640][Main][INFO] - [train] Step 61550 out of 80000 | Loss --> 1.760 | Grad_l2 --> 0.314 | Weights_l2 --> 9081.260 | Lr --> 0.001 | Seconds_per_step --> 4.443 |
1343
+ [2024-08-12 01:39:17,817][Main][INFO] - [train] Step 61600 out of 80000 | Loss --> 1.766 | Grad_l2 --> 0.313 | Weights_l2 --> 9081.169 | Lr --> 0.001 | Seconds_per_step --> 4.464 |
1344
+ [2024-08-12 01:42:56,472][Main][INFO] - [train] Step 61650 out of 80000 | Loss --> 1.762 | Grad_l2 --> 0.316 | Weights_l2 --> 9081.077 | Lr --> 0.001 | Seconds_per_step --> 4.373 |
1345
+ [2024-08-12 01:46:34,161][Main][INFO] - [train] Step 61700 out of 80000 | Loss --> 1.771 | Grad_l2 --> 0.317 | Weights_l2 --> 9080.988 | Lr --> 0.001 | Seconds_per_step --> 4.354 |
1346
+ [2024-08-12 01:50:15,341][Main][INFO] - [train] Step 61750 out of 80000 | Loss --> 1.759 | Grad_l2 --> 0.315 | Weights_l2 --> 9080.903 | Lr --> 0.001 | Seconds_per_step --> 4.424 |
1347
+ [2024-08-12 01:53:56,286][Main][INFO] - [train] Step 61800 out of 80000 | Loss --> 1.760 | Grad_l2 --> 0.316 | Weights_l2 --> 9080.810 | Lr --> 0.001 | Seconds_per_step --> 4.419 |
1348
+ [2024-08-12 01:57:35,453][Main][INFO] - [train] Step 61850 out of 80000 | Loss --> 1.764 | Grad_l2 --> 0.317 | Weights_l2 --> 9080.725 | Lr --> 0.001 | Seconds_per_step --> 4.383 |
1349
+ [2024-08-12 02:01:14,106][Main][INFO] - [train] Step 61900 out of 80000 | Loss --> 1.765 | Grad_l2 --> 0.316 | Weights_l2 --> 9080.646 | Lr --> 0.001 | Seconds_per_step --> 4.373 |
1350
+ [2024-08-12 02:04:55,693][Main][INFO] - [train] Step 61950 out of 80000 | Loss --> 1.756 | Grad_l2 --> 0.316 | Weights_l2 --> 9080.552 | Lr --> 0.001 | Seconds_per_step --> 4.432 |
1351
+ [2024-08-12 02:08:35,956][Main][INFO] - [train] Step 62000 out of 80000 | Loss --> 1.757 | Grad_l2 --> 0.317 | Weights_l2 --> 9080.465 | Lr --> 0.001 | Seconds_per_step --> 4.405 |
1352
+ [2024-08-12 02:12:08,062][Main][INFO] - [train] Step 62050 out of 80000 | Loss --> 1.763 | Grad_l2 --> 0.316 | Weights_l2 --> 9080.379 | Lr --> 0.001 | Seconds_per_step --> 4.242 |
1353
+ [2024-08-12 02:15:46,511][Main][INFO] - [train] Step 62100 out of 80000 | Loss --> 1.762 | Grad_l2 --> 0.316 | Weights_l2 --> 9080.297 | Lr --> 0.001 | Seconds_per_step --> 4.369 |
1354
+ [2024-08-12 02:19:22,962][Main][INFO] - [train] Step 62150 out of 80000 | Loss --> 1.759 | Grad_l2 --> 0.318 | Weights_l2 --> 9080.214 | Lr --> 0.001 | Seconds_per_step --> 4.329 |
1355
+ [2024-08-12 02:22:58,963][Main][INFO] - [train] Step 62200 out of 80000 | Loss --> 1.760 | Grad_l2 --> 0.317 | Weights_l2 --> 9080.127 | Lr --> 0.001 | Seconds_per_step --> 4.320 |
1356
+ [2024-08-12 02:26:37,400][Main][INFO] - [train] Step 62250 out of 80000 | Loss --> 1.757 | Grad_l2 --> 0.317 | Weights_l2 --> 9080.042 | Lr --> 0.001 | Seconds_per_step --> 4.369 |
1357
+ [2024-08-12 02:30:16,727][Main][INFO] - [train] Step 62300 out of 80000 | Loss --> 1.772 | Grad_l2 --> 0.319 | Weights_l2 --> 9079.958 | Lr --> 0.001 | Seconds_per_step --> 4.387 |
1358
+ [2024-08-12 02:33:56,219][Main][INFO] - [train] Step 62350 out of 80000 | Loss --> 1.757 | Grad_l2 --> 0.316 | Weights_l2 --> 9079.868 | Lr --> 0.001 | Seconds_per_step --> 4.390 |
1359
+ [2024-08-12 02:37:37,466][Main][INFO] - [train] Step 62400 out of 80000 | Loss --> 1.758 | Grad_l2 --> 0.318 | Weights_l2 --> 9079.775 | Lr --> 0.001 | Seconds_per_step --> 4.425 |
1360
+ [2024-08-12 02:41:13,861][Main][INFO] - [train] Step 62450 out of 80000 | Loss --> 1.756 | Grad_l2 --> 0.318 | Weights_l2 --> 9079.691 | Lr --> 0.001 | Seconds_per_step --> 4.328 |
1361
+ [2024-08-12 02:44:48,098][Main][INFO] - [train] Step 62500 out of 80000 | Loss --> 1.754 | Grad_l2 --> 0.316 | Weights_l2 --> 9079.610 | Lr --> 0.001 | Seconds_per_step --> 4.285 |
checkpoints/seconds_per_step_over_steps.png CHANGED
checkpoints/training_metrics.csv CHANGED
@@ -1199,3 +1199,53 @@ timestamp,step,loss,grad_l2,weights_l2,lr,seconds_per_step
1199
  "2024-08-11 23:33:42,430",59900,1.761,0.312,9084.295,0.002,4.371
1200
  "2024-08-11 23:37:30,256",59950,1.749,0.313,9084.198,0.002,4.556
1201
  "2024-08-11 23:41:15,929",60000,1.763,0.311,9084.104,0.002,4.513
1202
+ "2024-08-11 23:45:07,108",60050,1.754,0.312,9084.007,0.002,4.624
1203
+ "2024-08-11 23:48:44,602",60100,1.76,0.31,9083.915,0.002,4.35
1204
+ "2024-08-11 23:52:26,043",60150,1.76,0.312,9083.829,0.001,4.429
1205
+ "2024-08-11 23:56:03,737",60200,1.771,0.311,9083.733,0.001,4.354
1206
+ "2024-08-11 23:59:47,660",60250,1.767,0.312,9083.64,0.001,4.478
1207
+ "2024-08-12 00:03:32,244",60300,1.771,0.313,9083.55,0.001,4.492
1208
+ "2024-08-12 00:07:17,431",60350,1.78,0.314,9083.451,0.001,4.504
1209
+ "2024-08-12 00:11:01,326",60400,1.768,0.311,9083.362,0.001,4.478
1210
+ "2024-08-12 00:14:45,402",60450,1.779,0.313,9083.267,0.001,4.482
1211
+ "2024-08-12 00:18:30,537",60500,1.78,0.314,9083.181,0.001,4.503
1212
+ "2024-08-12 00:22:22,904",60550,1.776,0.312,9083.093,0.001,4.647
1213
+ "2024-08-12 00:26:07,858",60600,1.779,0.315,9082.995,0.001,4.499
1214
+ "2024-08-12 00:29:46,792",60650,1.774,0.312,9082.9,0.001,4.379
1215
+ "2024-08-12 00:33:31,405",60700,1.774,0.313,9082.805,0.001,4.492
1216
+ "2024-08-12 00:37:13,998",60750,1.771,0.313,9082.715,0.001,4.452
1217
+ "2024-08-12 00:40:53,937",60800,1.766,0.314,9082.626,0.001,4.399
1218
+ "2024-08-12 00:44:29,587",60850,1.773,0.315,9082.535,0.001,4.313
1219
+ "2024-08-12 00:48:07,873",60900,1.778,0.315,9082.441,0.001,4.366
1220
+ "2024-08-12 00:51:44,166",60950,1.784,0.314,9082.349,0.001,4.326
1221
+ "2024-08-12 00:55:24,836",61000,1.775,0.313,9082.26,0.001,4.413
1222
+ "2024-08-12 00:59:05,951",61050,1.77,0.314,9082.17,0.001,4.422
1223
+ "2024-08-12 01:02:44,096",61100,1.763,0.315,9082.077,0.001,4.363
1224
+ "2024-08-12 01:06:23,695",61150,1.771,0.315,9081.988,0.001,4.392
1225
+ "2024-08-12 01:10:01,742",61200,1.769,0.315,9081.898,0.001,4.361
1226
+ "2024-08-12 01:13:39,844",61250,1.767,0.315,9081.809,0.001,4.362
1227
+ "2024-08-12 01:17:15,187",61300,1.767,0.316,9081.72,0.001,4.307
1228
+ "2024-08-12 01:20:53,488",61350,1.762,0.318,9081.631,0.001,4.366
1229
+ "2024-08-12 01:24:34,284",61400,1.764,0.317,9081.538,0.001,4.416
1230
+ "2024-08-12 01:28:12,531",61450,1.768,0.317,9081.448,0.001,4.365
1231
+ "2024-08-12 01:31:52,508",61500,1.77,0.314,9081.354,0.001,4.4
1232
+ "2024-08-12 01:35:34,640",61550,1.76,0.314,9081.26,0.001,4.443
1233
+ "2024-08-12 01:39:17,817",61600,1.766,0.313,9081.169,0.001,4.464
1234
+ "2024-08-12 01:42:56,472",61650,1.762,0.316,9081.077,0.001,4.373
1235
+ "2024-08-12 01:46:34,161",61700,1.771,0.317,9080.988,0.001,4.354
1236
+ "2024-08-12 01:50:15,341",61750,1.759,0.315,9080.903,0.001,4.424
1237
+ "2024-08-12 01:53:56,286",61800,1.76,0.316,9080.81,0.001,4.419
1238
+ "2024-08-12 01:57:35,453",61850,1.764,0.317,9080.725,0.001,4.383
1239
+ "2024-08-12 02:01:14,106",61900,1.765,0.316,9080.646,0.001,4.373
1240
+ "2024-08-12 02:04:55,693",61950,1.756,0.316,9080.552,0.001,4.432
1241
+ "2024-08-12 02:08:35,956",62000,1.757,0.317,9080.465,0.001,4.405
1242
+ "2024-08-12 02:12:08,062",62050,1.763,0.316,9080.379,0.001,4.242
1243
+ "2024-08-12 02:15:46,511",62100,1.762,0.316,9080.297,0.001,4.369
1244
+ "2024-08-12 02:19:22,962",62150,1.759,0.318,9080.214,0.001,4.329
1245
+ "2024-08-12 02:22:58,963",62200,1.76,0.317,9080.127,0.001,4.32
1246
+ "2024-08-12 02:26:37,400",62250,1.757,0.317,9080.042,0.001,4.369
1247
+ "2024-08-12 02:30:16,727",62300,1.772,0.319,9079.958,0.001,4.387
1248
+ "2024-08-12 02:33:56,219",62350,1.757,0.316,9079.868,0.001,4.39
1249
+ "2024-08-12 02:37:37,466",62400,1.758,0.318,9079.775,0.001,4.425
1250
+ "2024-08-12 02:41:13,861",62450,1.756,0.318,9079.691,0.001,4.328
1251
+ "2024-08-12 02:44:48,098",62500,1.754,0.316,9079.61,0.001,4.285
checkpoints/weights_l2_over_steps.png CHANGED