Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning • Paper 2510.25992 • Published Oct 29, 2025