LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM • Paper • 2503.04724 • Published 3 days ago • 45
Token-Efficient Long Video Understanding for Multimodal LLMs • Paper • 2503.04130 • Published 3 days ago • 61
Mobius: Text to Seamless Looping Video Generation via Latent Shift • Paper • 2502.20307 • Published 10 days ago • 16
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute • Paper • 2502.20126 • Published 10 days ago • 19
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think • Paper • 2502.20172 • Published 10 days ago • 26
R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts • Paper • 2502.20395 • Published 10 days ago • 43
MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning • Paper • 2502.19634 • Published 11 days ago • 56
PosterSum: A Multimodal Benchmark for Scientific Poster Summarization • Paper • 2502.17540 • Published 13 days ago • 2
Project Alexandria: Towards Freeing Scientific Knowledge from Copyright Burdens via LLMs • Paper • 2502.19413 • Published 11 days ago • 19
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning? • Paper • 2502.19361 • Published 11 days ago • 26
Language Models' Factuality Depends on the Language of Inquiry • Paper • 2502.17955 • Published 12 days ago • 29
GHOST 2.0: generative high-fidelity one shot transfer of heads • Paper • 2502.18417 • Published 12 days ago • 62
TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding • Paper • 2502.19400 • Published 11 days ago • 42
Rare Disease Differential Diagnosis with Large Language Models at Scale: From Abdominal Actinomycosis to Wilson's Disease • Paper • 2502.15069 • Published 17 days ago • 2
The Relationship Between Reasoning and Performance in Large Language Models -- o3 (mini) Thinks Harder, Not Longer • Paper • 2502.15631 • Published 16 days ago • 8
MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models • Paper • 2502.14302 • Published 17 days ago • 9