Taking Notes Brings Focus? Towards Multi-Turn Multimodal Dialogue Learning Paper • 2503.07002 • Published 3 days ago • 36
YuE: Scaling Open Foundation Models for Long-Form Music Generation Paper • 2503.08638 • Published 2 days ago • 52
Identifying Sensitive Weights via Post-quantization Integral Paper • 2503.01901 • Published 13 days ago • 7
Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities Paper • 2503.03983 • Published 8 days ago • 22
LINGOLY-TOO: Disentangling Memorisation from Reasoning with Linguistic Templatisation and Orthographic Obfuscation Paper • 2503.02972 • Published 9 days ago • 23
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published 7 days ago • 76
LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM Paper • 2503.04724 • Published 7 days ago • 58
Remasking Discrete Diffusion Models with Inference-Time Scaling Paper • 2503.00307 • Published 12 days ago • 9
TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding Paper • 2502.19400 • Published 15 days ago • 43
MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning Paper • 2502.19634 • Published 15 days ago • 58
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think Paper • 2502.20172 • Published 14 days ago • 27
UniTok: A Unified Tokenizer for Visual Generation and Understanding Paper • 2502.20321 • Published 14 days ago • 29
Mobius: Text to Seamless Looping Video Generation via Latent Shift Paper • 2502.20307 • Published 14 days ago • 17
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute Paper • 2502.20126 • Published 14 days ago • 20
OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference Paper • 2502.18411 • Published 16 days ago • 69
Project Alexandria: Towards Freeing Scientific Knowledge from Copyright Burdens via LLMs Paper • 2502.19413 • Published 15 days ago • 19