llama-3.1-8b-instruct_grpo-GSM8K / model-00001-of-00004.safetensors

Commit History