---
base_model:
- open-r1/Qwen2.5-Math-7B-RoPE-300k
- Qwen/Qwen2.5-Math-7B
datasets:
- Elliott/Openr1-Math-46k-8192
license: mit
pipeline_tag: text-generation
library_name: transformers
arxiv: 2506.19767
---

# 📄 Introduction

Supervised Reinforcement Fine-Tuning (SRFT) is a single-stage method that unifies the two fine-tuning paradigms, supervised fine-tuning (SFT) and reinforcement learning (RL), through entropy-aware weighting mechanisms.

Paper: [arXiv](https://arxiv.org/abs/2506.19767)

Project Website: [SRFT](https://anonymous.4open.science/w/SRFT2025)
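
To make "entropy-aware weighting" concrete, here is a minimal sketch of how a single-stage objective might blend a supervised loss with a reinforcement loss using the policy's entropy. It is an illustration under stated assumptions, not the reference implementation: the function names (`token_entropy`, `mean_entropy`, `combined_loss`) are hypothetical, and the `exp(-H)` form of the weight is assumed here for simplicity. See the paper for the exact weighting functions.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_entropy(dists):
    """Average entropy over a sequence of next-token distributions."""
    return sum(token_entropy(d) for d in dists) / len(dists)


def combined_loss(sft_loss, rl_loss, dists):
    """Blend the two losses in one step.

    Assumption (illustrative only): the SFT term is scaled by exp(-H),
    so a high-entropy (uncertain) policy receives a smaller push from
    the demonstrations. In a real training loop this weight would be
    treated as a constant, with no gradient flowing through it.
    """
    h = mean_entropy(dists)
    w_sft = math.exp(-h)
    return w_sft * sft_loss + rl_loss, w_sft


# A uniform distribution over 4 tokens has entropy ln(4), so w_sft is about 0.25.
loss, w_sft = combined_loss(sft_loss=2.0, rl_loss=1.0, dists=[[0.25] * 4])
```

With a fully confident policy (all probability on one token) the entropy is zero and the assumed SFT weight is 1, so the supervised term is applied at full strength.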