|
--- |
|
base_model: unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit |
|
library_name: peft |
|
--- |
|
|
|
# Model Card for Model ID |
|
|
|
<!-- Provide a quick summary of what the model is/does. --> |
|
This is the model that is used to get the paper results for the MPCxR1 Qwen2.5 3B SFT GRPO model. |
|
This model was evaluated on the 19.04.25. and trained on the 18.04.25 at 13:29:26. |
|
The base model used for this training was "nibauman/race_llm_Qwen_3B_sft" [here](https://huggingface.co/nibauman/race_llm_Qwen_3B_sft) |
|
This is the wandb training: https://wandb.ai/CoRL-heist-2025/mpc_grpo/runs/osa45as5?nw=nwusernibaumaneth |
|
|
|
|
|
|