Qwen2.5-14B-Instruct-ultrafeedback-spin-iter1-RPO / model-00002-of-00006.safetensors

Commit History

Training in progress, epoch 1
333c0f9
verified

AmberYifan committed on