GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents Paper • 2506.03143 • Published 4 days ago • 38
SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis Paper • 2506.02096 • Published 5 days ago • 50
VAU-R1: Advancing Video Anomaly Understanding via Reinforcement Fine-Tuning Paper • 2505.23504 • Published 10 days ago • 6
Taming LLMs by Scaling Learning Rates with Gradient Grouping Paper • 2506.01049 • Published 7 days ago • 36
Exploring the Latent Capacity of LLMs for One-Step Text Generation Paper • 2505.21189 • Published 12 days ago • 60
One RL to See Them All: Visual Triple Unified Reinforcement Learning Paper • 2505.18129 • Published 15 days ago • 59
Reasoning Model is Stubborn: Diagnosing Instruction Overriding in Reasoning Models Paper • 2505.17225 • Published 16 days ago • 64
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning Paper • 2505.17667 • Published 16 days ago • 86
Shifting AI Efficiency From Model-Centric to Data-Centric Compression Paper • 2505.19147 • Published 14 days ago • 144
Table-R1: Inference-Time Scaling for Table Reasoning Paper • 2505.23621 • Published 9 days ago • 89
Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start Paper • 2505.22334 • Published 11 days ago • 36
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures Paper • 2505.09343 • Published 25 days ago • 63
Flow-GRPO: Training Flow Matching Models via Online RL Paper • 2505.05470 • Published about 1 month ago • 78
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6 • 169