Repo for the paper "Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability".
Qihan Ren
jasonrqh
AI & ML interests
XAI, LLM reasoning & safety, Coding agent
Recent Activity
upvoted a paper about 5 hours ago
Scaling the Horizon, Not the Parameters: Reaching Trillion-Parameter Performance with a 35B Agent
upvoted a paper 4 days ago
The Verification Horizon: No Silver Bullet for Coding Agent Rewards