Temporal Preference Optimization for Long-Form Video Understanding Paper • 2501.13919 • Published 4 days ago • 18 • 3
IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models Paper • 2501.13920 • Published 4 days ago • 12 • 2
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking Paper • 2501.09751 • Published 11 days ago • 46 • 2
Multimodal LLMs Can Reason about Aesthetics in Zero-Shot Paper • 2501.09012 • Published 12 days ago • 10 • 2
MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents Paper • 2501.08828 • Published 12 days ago • 28 • 2
Ouroboros-Diffusion: Exploring Consistent Content Generation in Tuning-free Long Video Diffusion Paper • 2501.09019 • Published 12 days ago • 12 • 2
HALoGEN: Fantastic LLM Hallucinations and Where to Find Them Paper • 2501.08292 • Published 13 days ago • 16 • 2
VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control Paper • 2501.01427 • Published 25 days ago • 49 • 3