nibauman/MPCxR1_Qwen1.5B_SFT_GRPO

nibauman committed on Apr 20

Commit 8a459b9 (verified) · 1 parent: 02307fb

Upload folder using huggingface_hub

Files changed (1)

README.md CHANGED

@@ -8,3 +8,6 @@ library_name: peft
 <!-- Provide a quick summary of what the model is/does. -->
 This model was used to produce the paper results for the MPCxR1 Qwen2.5 1.5B SFT GRPO model.
 It was trained on 18.04.25 at 15:30:47 and evaluated on 19.04.25.

+The base model was [nibauman/race_llm_Qwen_1_5B_sft](https://huggingface.co/nibauman/race_llm_Qwen_1_5B_sft).
+The W&B training run is at https://wandb.ai/CoRL-heist-2025/mpc_grpo/runs/4ydp2ilr?nw=nwusernibaumaneth
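
Since the card declares `library_name: peft` and names a base model, loading this checkpoint might look like the sketch below. It is a minimal sketch, not the author's documented procedure. It assumes that this repository holds a PEFT adapter, that the base repository holds full weights loadable with `AutoModelForCausalLM`, and that the tokenizer can be taken from the base repository. The helper name `load_model` is hypothetical.

```python
# Repo IDs taken from the commit page and the README diff.
BASE_MODEL_ID = "nibauman/race_llm_Qwen_1_5B_sft"  # base model named in the README
ADAPTER_ID = "nibauman/MPCxR1_Qwen1.5B_SFT_GRPO"   # this repo; ASSUMED to be a PEFT adapter


def load_model(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Hypothetical helper: load the base model and attach the adapter.

    Calling this downloads weights from the Hugging Face Hub.
    """
    # Imported lazily so the constants above can be inspected
    # without peft/transformers installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # ASSUMPTION: the tokenizer is available in the base repository.
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    # ASSUMPTION: the base repository contains full causal-LM weights.
    base = AutoModelForCausalLM.from_pretrained(base_id)
    # Attach the adapter weights on top of the base model.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()  # inference mode: disables dropout
    return model, tokenizer
```

Calling `model, tokenizer = load_model()` would then return objects usable with the standard `generate` API, provided the assumptions above hold for these repositories.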