base_model: unsloth/qwen2.5-1.5b-instruct-unsloth-bnb-4bit
library_name: peft
# Model Card for MPCxR1 Qwen2.5 1.5B SFT GRPO
This is the checkpoint used to produce the paper results for the MPCxR1 Qwen2.5 1.5B SFT GRPO model.
It was trained on 18.04.25 at 15:30:47 and evaluated on 19.04.25.
The base model was [`nibauman/race_llm_Qwen_1_5B_sft`](https://huggingface.co/nibauman/race_llm_Qwen_1_5B_sft).
The Weights & Biases training run is available at https://wandb.ai/CoRL-heist-2025/mpc_grpo/runs/4ydp2ilr?nw=nwusernibaumaneth
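
Since the metadata lists `peft` as the library, the adapter can probably be loaded along the lines of the sketch below. This is a minimal example under stated assumptions, not the authors' documented usage: `ADAPTER_ID` is a hypothetical placeholder (this card does not state the adapter's repository ID), the helper name `load_adapter` is made up for illustration, and the code assumes the adapter config records its base model so that `AutoPeftModelForCausalLM` can resolve it. Loading a 4-bit base model will most likely also require `bitsandbytes` and a GPU.

```python
# Hypothetical placeholder: replace with the actual repository ID of this adapter.
ADAPTER_ID = "your-username/your-adapter-repo"


def load_adapter(adapter_id: str = ADAPTER_ID, **kwargs):
    """Load a PEFT adapter together with the base model named in its config.

    Extra keyword arguments are forwarded to `from_pretrained`.
    """
    # Imported lazily so that importing this file does not require the heavy dependencies.
    from peft import AutoPeftModelForCausalLM
    from transformers import AutoTokenizer

    # Reads the adapter config, loads the base model it names, then attaches the adapter.
    model = AutoPeftModelForCausalLM.from_pretrained(adapter_id, **kwargs)

    # Assumption: the tokenizer is taken from the base model recorded in the adapter config.
    base_id = model.peft_config["default"].base_model_name_or_path
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    return model, tokenizer
```

Calling `model, tokenizer = load_adapter("<actual adapter repo>")` would then return objects that can be used with the usual `generate` workflow. Resolving the base model from the adapter config avoids hard-coding which of the two base models named in this card applies.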