qwen2.5_3b_grpo / README.md
klogram's picture
Update README.md
80cb0c4 verified
---
datasets:
- openai/gsm8k
base_model:
- Qwen/Qwen2.5-3B-Instruct
---
System prompt:
```
Respond in the following format:
<reasoning>
...
</reasoning>
<answer>
...
</answer>
```