AAD-LLM: Neural Attention-Driven Auditory Scene Understanding Paper • 2502.16794 • Published 3 days ago • 4
Unveiling Downstream Performance Scaling of LLMs: A Clustering-Based Perspective Paper • 2502.17262 • Published 2 days ago • 14
SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference Paper • 2502.18137 • Published 1 day ago • 40
OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference Paper • 2502.18411 • Published 1 day ago • 53
Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam Paper • 2502.17055 • Published 3 days ago • 13
Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models Paper • 2502.16033 • Published 5 days ago • 15
Mobile-Agent-V: Learning Mobile Device Operation Through Video-Guided Multi-Agent Collaboration Paper • 2502.17110 • Published 2 days ago • 10
GCC: Generative Color Constancy via Diffusing a Color Checker Paper • 2502.17435 • Published 2 days ago • 23
Multimodal RewardBench: Holistic Evaluation of Reward Models for Vision Language Models Paper • 2502.14191 • Published 7 days ago • 6
How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM? Paper • 2502.14502 • Published 6 days ago • 77
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published 6 days ago • 91
Expect the Unexpected: FailSafe Long Context QA for Finance Paper • 2502.06329 • Published 17 days ago • 124
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14 • 273
STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution Paper • 2501.02976 • Published Jan 6 • 54
TransPixar: Advancing Text-to-Video Generation with Transparency Paper • 2501.03006 • Published Jan 6 • 23
METAGENE-1: Metagenomic Foundation Model for Pandemic Monitoring Paper • 2501.02045 • Published Jan 3 • 21
TAPTRv3: Spatial and Temporal Context Foster Robust Tracking of Any Point in Long Video Paper • 2411.18671 • Published Nov 27, 2024 • 20
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published Nov 7, 2024 • 115