ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations Paper • 2504.00824 • Published 5 days ago • 35
TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization Paper • 2503.19901 • Published 12 days ago • 32
DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance Paper • 2504.01724 • Published 4 days ago • 57
MoCha: Towards Movie-Grade Talking Character Synthesis Paper • 2503.23307 • Published 7 days ago • 87
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes Paper • 2503.23461 • Published 7 days ago • 87
SkyReels-A2: Compose Anything in Video Diffusion Transformers Paper • 2504.02436 • Published 3 days ago • 21
Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme Paper • 2504.02587 • Published 3 days ago • 27
ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers Paper • 2504.00502 • Published 5 days ago • 17
PaperBench: Evaluating AI's Ability to Replicate AI Research Paper • 2504.01848 • Published 4 days ago • 30
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction Paper • 2504.01014 • Published 5 days ago • 52
Scaling Laws in Scientific Discovery with AI and Robot Scientists Paper • 2503.22444 • Published 9 days ago • 11