Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs Paper • 2506.14245 • Published 25 days ago • 39
Xolver: Multi-Agent Reasoning with Holistic Experience Learning Just Like an Olympiad Team Paper • 2506.14234 • Published 25 days ago • 38
MultiFinBen: A Multilingual, Multimodal, and Difficulty-Aware Benchmark for Financial LLM Evaluation Paper • 2506.14028 • Published 26 days ago • 91
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs Paper • 2506.14429 • Published 25 days ago • 44
GMT: General Motion Tracking for Humanoid Whole-Body Control Paper • 2506.14770 • Published 25 days ago • 8
SwarmAgentic: Towards Fully Automated Agentic System Generation via Swarm Intelligence Paper • 2506.15672 • Published 24 days ago • 13
Semantically-Aware Rewards for Open-Ended R1 Training in Free-Form Generation Paper • 2506.15068 • Published 25 days ago • 14
Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence Paper • 2506.15677 • Published 24 days ago • 24
ProtoReasoning: Prototypes as the Foundation for Generalizable Reasoning in LLMs Paper • 2506.15211 • Published 24 days ago • 35
GenRecal: Generation after Recalibration from Large to Small Vision-Language Models Paper • 2506.15681 • Published 24 days ago • 37
Improved Iterative Refinement for Chart-to-Code Generation via Structured Instruction Paper • 2506.14837 • Published 27 days ago • 11
Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective Paper • 2506.14965 • Published 25 days ago • 47
Dirichlet Flow Matching with Applications to DNA Sequence Design Paper • 2402.05841 • Published Feb 8, 2024 • 2
Gumbel-Softmax Flow Matching with Straight-Through Guidance for Controllable Biological Sequence Generation Paper • 2503.17361 • Published Mar 21 • 5