Submitted by BoZhang 92 NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification · 25 authors 1
Submitted by Xiaoye08 51 Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models · 5 authors 3
Submitted by dongguanting 44 Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning · 10 authors 2
Submitted by wenhu 39 Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning · 5 authors 2
Submitted by Liang0223 37 KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models · 10 authors 2
Submitted by DongfuJiang 31 QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design · 5 authors 2
Submitted by gogoduan 23 GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning · 8 authors 2
Submitted by yyyou 22 LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning · 8 authors 3
Submitted by i-udovichenko 21 Risk-Averse Reinforcement Learning with Itakura-Saito Loss · 5 authors 2
Submitted by Franck-Dernoncourt 20 Understanding Generative AI Capabilities in Everyday Image Editing Tasks · 7 authors 2
Submitted by ychenNLP 19 AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning · 8 authors 2
Submitted by yanyc 19 Mind the Gap: Bridging Thought Leap for Improved Chain-of-Thought Tuning · 10 authors 1
Submitted by tricktreat 19 Let LLMs Break Free from Overthinking via Self-Braking Tuning · 10 authors 2
Submitted by taesiri 18 VideoGameQA-Bench: Evaluating Vision-Language Models for Video Game Quality Assurance · 5 authors 2
Submitted by rp-yu 15 Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding · 3 authors 2
Submitted by nthakur 15 Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval · 4 authors 3
Submitted by XuankunRong 15 Backdoor Cleaning without External Guidance in MLLM Fine-tuning · 8 authors 2
Submitted by julianjuaner 13 Training-Free Efficient Video Generation via Dynamic Token Carving · 9 authors 2
Submitted by KaituoFeng 12 SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward · 5 authors 2
Submitted by weizhepei 11 WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning · 12 authors 2
Submitted by haoningwu 10 SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding · 6 authors 2
Submitted by jacklishufan 10 LaViDa: A Large Diffusion Language Model for Multimodal Understanding · 10 authors 2
Submitted by zhangchenxu 10 TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning · 7 authors 2
Submitted by KevinQHLin 9 Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models · 4 authors 2
Submitted by Kikkk 6 AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios · 8 authors 2
Submitted by ilgee 6 Think-RM: Enabling Long-Horizon Reasoning in Generative Reward Models · 11 authors 2
Submitted by xhyandwyy 6 VLM-R^3: Region Recognition, Reasoning, and Refinement for Enhanced Multimodal Chain-of-Thought · 9 authors 2
Submitted by ryokamoi 6 Training Step-Level Reasoning Verifiers with Formal Verification Tools · 5 authors 2
Submitted by RunsenXu 5 Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models · 9 authors 2
Submitted by sagnikM 5 Reinforcement Learning Finetunes Small Subnetworks in Large Language Models · 4 authors 2
Submitted by MING-ZCH 3 Let Androids Dream of Electric Sheep: A Human-like Image Implication Understanding and Reasoning Framework · 2 authors 3
Submitted by keplerccc 3 Robo2VLM: Visual Question Answering from Large-Scale In-the-Wild Robot Manipulation Datasets · 4 authors 2
Submitted by ingeol 3 How Do Large Vision-Language Models See Text in Image? Unveiling the Distinctive Role of OCR Heads · 4 authors 2
Submitted by jaagli 3 RAVENEA: A Benchmark for Multimodal Retrieval-Augmented Visual Culture Understanding · 11 authors 2
Submitted by berkegokmen1 3 RoPECraft: Training-Free Motion Transfer with Trajectory-Guided RoPE Optimization on Diffusion Transformers · 4 authors 2
Submitted by gsarti 2 Steering Large Language Models for Machine Translation Personalization · 5 authors 2
Submitted by ayyyq 2 When Do LLMs Admit Their Mistakes? Understanding the Role of Model Belief in Retraction · 2 authors 2
Submitted by gagan3012 2 Date Fragments: A Hidden Bottleneck of Tokenization for Temporal Reasoning · 3 authors 2
Submitted by reachomk 2 gen2seg: Generative Models Enable Generalizable Instance Segmentation · 2 authors 2
Submitted by seyoungsong 2 MUG-Eval: A Proxy Evaluation Framework for Multilingual Generation Capabilities in Any Language · 7 authors 2
Submitted by philippds 1 SPhyR: Spatial-Physical Reasoning Benchmark on Material Distribution · 1 author 2
Submitted by zenyn - SAKURA: On the Multi-hop Reasoning of Large Audio-Language Models Based on Speech and Audio Information · 4 authors 2