UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision Paper • 2601.03193 • Published 8 days ago • 44
From Word to World: Can Large Language Models be Implicit Text-based World Models? Paper • 2512.18832 • Published 24 days ago • 12
Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny Paper • 2507.16331 • Published Jul 22, 2025 • 20
Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models Paper • 2508.09874 • Published Aug 13, 2025 • 10
Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents Paper • 2509.26354 • Published Sep 30, 2025 • 17
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6, 2025 • 188
FlowRL: Matching Reward Distributions for LLM Reasoning Paper • 2509.15207 • Published Sep 18, 2025 • 115
FlowRL: Matching Reward Distributions for LLM Reasoning Paper • 2509.15207 • Published Sep 18, 2025 • 115