Any Information Is Just Worth One Single Screenshot: Unifying Search With Visualized Information Retrieval Paper • 2502.11431 • Published Feb 17
OmniGen2: Exploration to Advanced Multimodal Generation Paper • 2506.18871 • Published 3 days ago • 64
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated Apr 28 • 496
OmniGen2: Exploration to Advanced Multimodal Generation Paper • 2506.18871 • Published 3 days ago • 64
VideoDeepResearch: Long Video Understanding With Agentic Tool Using Paper • 2506.10821 • Published 14 days ago • 19
MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval Paper • 2412.14475 • Published Dec 19, 2024 • 55
MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval Paper • 2412.14475 • Published Dec 19, 2024 • 55
VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation Paper • 2412.10704 • Published Dec 14, 2024 • 15