Submitted by MiniMax-AI 171 MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention · 127 authors 3
Submitted by schrodingers-tiger 56 Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning · 27 authors 2
Submitted by Ayanami0730 37 DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents · 5 authors 1
Submitted by shulin16 26 Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning · 10 authors 1
Submitted by shuaishuaicdp 24 Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency · 6 authors 1
Submitted by rp-yu 18 Discrete Diffusion in Large Language and Multimodal Models: A Survey · 3 authors 2
Submitted by zhendch 17 Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression · 8 authors 1
Submitted by WTNswaggy 14 PersonaFeedback: A Large-scale Human-annotated Benchmark For Personalization · 6 authors 1
Submitted by IgnoraZ 13 From Real to Synthetic: Synthesizing Millions of Diversified and Complicated User Instructions with Attributed Grounding · 4 authors 1
Submitted by LPY 10 BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models · 9 authors 1
Submitted by iwiwi 5 ALE-Bench: A Benchmark for Long-Horizon Objective-Driven Algorithm Engineering · 6 authors 1
Submitted by pranavAL2109 3 Supernova Event Dataset: Interpreting Large Language Model's Personality through Critical Event Analysis · 2 authors 1
Submitted by Franck-Dernoncourt 2 Forecasting Time Series with LLMs via Patch-Based Prompting and Decomposition · 10 authors 1
Submitted by Franck-Dernoncourt 2 MS4UI: A Dataset for Multi-modal Summarization of User Interface Instructional Videos · 8 authors 1
Submitted by zainmujahid 2 Profiling News Media for Factuality and Bias Using LLMs and the Fact-Checking Methodology of Human Experts · 4 authors 1
Submitted by Owenngt 2 SRLAgent: Enhancing Self-Regulated Learning Skills through Gamification and LLM Assistance · 8 authors 1
Submitted by Taegyeonglee 1 QGuard: Question-based Zero-shot Guard for Multi-modal LLM Safety · 5 authors 1
Submitted by PChemGuy - AI-Facilitated Analysis of Abstracts and Conclusions: Flagging Unsubstantiated Claims and Ambiguous Pronouns · 1 author 1