Model Card for Model ID
This is the model used to produce the paper results for the MPCxR1 Qwen2.5 1.5B SFT GRPO model. It was trained on 18.04.25 at 15:30:47 and evaluated on 19.04.25.
The base model was "nibauman/race_llm_Qwen_1_5B_sft". The wandb training run is available at: https://wandb.ai/CoRL-heist-2025/mpc_grpo/runs/4ydp2ilr?nw=nwusernibaumaneth
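For readers who want to inspect the base model named above, the following is a minimal sketch of how it might be loaded. It assumes the `transformers` library is installed and that the base repository is published in a `transformers`-compatible causal-LM format; neither is confirmed by this card. The helper name `load_base_model` is hypothetical and only used for illustration.

```python
# Repository ID of the base model, as stated in this card.
BASE_MODEL_ID = "nibauman/race_llm_Qwen_1_5B_sft"


def load_base_model(model_id: str = BASE_MODEL_ID):
    """Hypothetical helper: download and return (tokenizer, model).

    Assumption: the repository holds standard transformers weights
    for a causal language model. This is not verified by the card.
    """
    # Imported lazily so the module can be imported without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return tokenizer, model
```

Calling `load_base_model()` downloads the weights from the Hub, so it needs network access and enough disk space and memory for a 1.5B-parameter model.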