StereoPilot: Learning Unified and Efficient Stereo Conversion via Generative Priors Paper • 2512.16915 • Published 7 days ago • 37
FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning Paper • 2510.22543 • Published Oct 26 • 11
Agent Lightning: Train ANY AI Agents with Reinforcement Learning Paper • 2508.03680 • Published Aug 5 • 121