Submitted by llwswyn · 60 · Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning · 18 authors · 75 · 1
Submitted by yuntian-deng · 49 · NeuralOS: Towards Simulating Operating Systems via Neural Generative Models · 5 authors · 9 · 5
Submitted by xwen99 · 48 · Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation · 8 authors · 1
Submitted by EricW123456 · 45 · CLiFT: Compressive Light-Field Tokens for Compute-Efficient and Adaptive Neural Rendering · 5 authors · 14 · 1
Submitted by iliashum · 30 · Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities · 3303 authors · 3
Submitted by yukimasano · 29 · KV Cache Steering for Inducing Reasoning in Small Language Models · 6 authors · 3
Submitted by JacobYuan · 21 · Lumos-1: On Autoregressive Video Generation from a Unified Model Perspective · 14 authors · 68 · 1
Submitted by Ksgk-fy · 7 · What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models · 4 authors · 1
Submitted by Sreyan88 · 5 · Audio Flamingo 3: Advancing Audio Intelligence with Fully Open Large Audio Language Models · 11 authors · 1
Submitted by Raincleared · 4 · BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity · 8 authors · 5 · 1
Submitted by ustc-zhangzm · 4 · Robust Multimodal Large Language Models Against Modality Conflict · 4 authors · 2 · 1
Submitted by maitysubhajit · 1 · Doodle Your Keypoints: Sketch-Based Few-Shot Keypoint Detection · 6 authors · 1
Submitted by nverma · 1 · DOTResize: Reducing LLM Width via Discrete Optimal Transport-based Neuron Merging · 3 authors · 1