Transformer Decoder-Only Model Training Report
==============================================
Training Start Time: Wed Mar 26 20:17:27 2025
Total Training Time: 191.49 seconds
Total Epochs: 1
Total Iterations: 547
Batch Size: 64
Learning Rate: 0.0001
Max Sequence Length: 128
Dataset Limit: 35000 rows
Epoch-wise Average Loss Values:
Epoch 1: 8.3391
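
The figures above can be cross-checked with a few lines of arithmetic. The sketch below is not part of the training code; it only re-derives quantities from the reported values. It rests on two assumptions the report does not state: that all 35000 rows were used for training with the final partial batch kept, and that the reported loss is a mean cross-entropy in nats per token. The variable names are illustrative.

```python
import math

# Values copied from the report.
dataset_rows = 35000
batch_size = 64
total_iterations = 547
total_seconds = 191.49
epoch_loss = 8.3391

# Assumption: every row is used for training and the last partial batch is kept,
# so one epoch takes ceil(rows / batch_size) iterations.
expected_iterations = math.ceil(dataset_rows / batch_size)

# Average wall-clock cost per iteration and rows processed per second.
seconds_per_iteration = total_seconds / total_iterations
rows_per_second = dataset_rows / total_seconds

# Assumption: the loss is mean cross-entropy in nats per token,
# in which case exp(loss) is the per-token perplexity.
perplexity = math.exp(epoch_loss)

print(f"expected iterations per epoch: {expected_iterations}")
print(f"matches reported iterations:   {expected_iterations == total_iterations}")
print(f"seconds per iteration:         {seconds_per_iteration:.3f}")
print(f"rows per second:               {rows_per_second:.1f}")
print(f"perplexity (if loss in nats):  {perplexity:.0f}")
```

Under the first assumption, the expected iteration count equals the reported 547, which is consistent with a single pass over the full 35000 rows. Under the second, a loss of 8.3391 corresponds to a perplexity on the order of several thousand, so the model is still at an early stage of training after one epoch.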