ER-GRPO-alpha99 / train_results.json
Model save
014bb1e verified
{
    "total_flos": 0.0,
    "train_loss": 0.00044901110231876373,
    "train_runtime": 4526.0548,
    "train_samples": 7000,
    "train_samples_per_second": 0.53,
    "train_steps_per_second": 0.011
}