Br-T-GPT-1 / training_report.txt
Transformer Decoder-Only Model Training Report
==============================================
Training Start Time: Wed Mar 26 20:17:27 2025
Total Training Time: 191.49 seconds
Total Epochs: 1
Total Iterations: 547
Batch Size: 64
Learning Rate: 0.0001
Max Sequence Length: 128
Dataset Limit: 35000 rows
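Consistency check (sketch): the iteration count above can be cross-checked against the dataset limit and batch size, and the total time converted into throughput. This assumes each row is one training sample, that all 35000 rows were used for training, and that the last partial batch was kept; none of this is stated in the report, and the variable names are illustrative only.

```python
import math

# Values copied from the report.
dataset_rows = 35_000
batch_size = 64
reported_iterations = 547
training_seconds = 191.49

# Assumption: the final, partially filled batch is kept rather than dropped.
expected_iterations = math.ceil(dataset_rows / batch_size)

# Average throughput over the whole run (includes any startup overhead).
iterations_per_second = reported_iterations / training_seconds
rows_per_second = dataset_rows / training_seconds

print(f"expected iterations: {expected_iterations}")
print(f"reported iterations: {reported_iterations}")
print(f"throughput: {iterations_per_second:.2f} it/s, {rows_per_second:.1f} rows/s")
```

Under these assumptions the expected and reported iteration counts agree, which suggests one full pass over the limited dataset.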
Epoch-wise Average Loss Values:
Epoch 1: 8.3391
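Interpreting the loss (sketch): if the reported value is a mean per-token cross-entropy in nats (an assumption; the report does not name the loss function or its units), it can be converted to perplexity with exp(loss). A uniform guess over V tokens has loss ln(V), so the perplexity is also the vocabulary size at which random guessing would score the same.

```python
import math

# Value copied from the report.
epoch_1_loss = 8.3391

# Assumption: mean per-token cross-entropy measured in nats (natural log).
perplexity = math.exp(epoch_1_loss)

# Same loss expressed in bits per token, for comparison with other reports.
bits_per_token = epoch_1_loss / math.log(2)

print(f"perplexity: {perplexity:.0f}")
print(f"bits per token: {bits_per_token:.2f}")
```

A perplexity in the thousands after a single epoch of about three minutes indicates the model is still at a very early stage of training.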