Transformer Decoder-Only Model Training Report
===========================================
Training Start Time: Wed Mar 26 20:17:27 2025
Total Training Time: 191.49 seconds
Total Epochs: 1
Total Iterations: 547
Batch Size: 64
Learning Rate: 0.0001
Max Sequence Length: 128
Dataset Limit: 35000 rows

Epoch-wise Average Loss Values:
Epoch 1: 8.3391
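
The reported figures can be cross-checked against each other. The sketch below uses only the values from the report and rests on two assumptions the report does not state: that the final partial batch is kept (so the iteration count is the ceiling of rows / batch size), and that the loss is a mean cross-entropy in nats (so its exponential can be read as a perplexity). The variable names are illustrative, not taken from the training script.

```python
import math

# Values copied from the report.
dataset_rows = 35000
batch_size = 64
total_iterations = 547
total_seconds = 191.49
epoch1_loss = 8.3391

# Assumption: the last, partial batch is kept rather than dropped.
expected_iterations = math.ceil(dataset_rows / batch_size)

# Average throughput over the whole run.
iterations_per_second = total_iterations / total_seconds

# Assumption: the loss is mean cross-entropy in nats,
# so exp(loss) can be interpreted as perplexity.
perplexity = math.exp(epoch1_loss)

print(f"Expected iterations: {expected_iterations}")
print(f"Matches report:      {expected_iterations == total_iterations}")
print(f"Iterations/second:   {iterations_per_second:.2f}")
print(f"Perplexity (approx): {perplexity:.0f}")
```

Under those assumptions, the iteration count agrees with the dataset limit and batch size, which suggests the single epoch covered all 35000 rows.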