LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling • Paper • 2606.18023 • Published 5 days ago • 193
RepFusion: Leveraging Multimodal Priors for Denoising in Representation Space • Paper • 2606.14700 • Published 9 days ago • 15
Learning A Unified Risk Map for Autonomous Driving in Partially Observable Environments • Paper • 2605.22189 • Published about 1 month ago • 8
TransitLM: A Large-Scale Dataset and Benchmark for Map-Free Transit Route Generation • Paper • 2605.22355 • Published about 1 month ago • 179
Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining • Paper • 2605.14747 • Published May 14 • 147
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence • Paper • 2605.12882 • Published May 13 • 274
RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards • Paper • 2605.10899 • Published May 11 • 79
Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context • Paper • 2605.13831 • Published May 13 • 88
IntentGrasp: A Comprehensive Benchmark for Intent Understanding • Paper • 2605.06832 • Published May 7 • 8
OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents • Paper • 2605.05185 • Published May 6 • 106
Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising • Paper • 2604.26694 • Published Apr 29 • 6