Model Card for nibauman/MPCxR1_Qwen3B_SFT_GRPO

This model was used to produce the paper results for the MPCxR1 Qwen2.5 3B SFT GRPO model. It was trained on 18.04.25 at 13:29:26 and evaluated on 19.04.25. The base model used for training was "nibauman/race_llm_Qwen_3B_sft". The wandb training run is available at: https://wandb.ai/CoRL-heist-2025/mpc_grpo/runs/osa45as5?nw=nwusernibaumaneth
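The card does not name the weight file to download, so the following is a minimal sketch of one way to find and fetch it. It assumes the repository is `nibauman/MPCxR1_Qwen3B_SFT_GRPO` (the name shown on this page), that it is publicly readable, and that the `huggingface_hub` package is installed. The helper `gguf_files` and the choice of the first match are illustrative assumptions, not part of the authors' workflow.

```python
from typing import Iterable, List

# Repository name as listed on this page.
REPO_ID = "nibauman/MPCxR1_Qwen3B_SFT_GRPO"


def gguf_files(filenames: Iterable[str]) -> List[str]:
    """Return the GGUF weight files among a repository's file names, sorted."""
    return sorted(name for name in filenames if name.lower().endswith(".gguf"))


def main() -> None:
    # Imported here so the helper above works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download, list_repo_files

    candidates = gguf_files(list_repo_files(REPO_ID))
    if not candidates:
        raise SystemExit(f"No .gguf files found in {REPO_ID}")

    # Assumption: the first match is the file you want.
    # Inspect `candidates` yourself if the repository holds several quantizations.
    local_path = hf_hub_download(repo_id=REPO_ID, filename=candidates[0])
    print(local_path)


if __name__ == "__main__":
    main()
```

The downloaded file can then be passed to any GGUF-compatible runtime. Listing the files first avoids guessing a file name that the card does not state.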

Format: GGUF
Model size: 3.09B params
Architecture: qwen2
Quantization: 5-bit

Model tree for nibauman/MPCxR1_Qwen3B_SFT_GRPO

Adapters: 1 model