mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality Paper • 2304.14178 • Published Apr 27, 2023 • 3
UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model Paper • 2310.05126 • Published Oct 8, 2023 • 1
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding Paper • 2307.02499 • Published Jul 4, 2023 • 14
BUS: Efficient and Effective Vision-language Pre-training with Bottom-Up Patch Summarization Paper • 2307.08504 • Published Jul 17, 2023
Evaluation and Analysis of Hallucination in Large Vision-Language Models Paper • 2308.15126 • Published Aug 29, 2023 • 1
HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training Paper • 2212.14546 • Published Dec 30, 2022
Learning Trajectory-Word Alignments for Video-Language Tasks Paper • 2301.01953 • Published Jan 5, 2023
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video Paper • 2302.00402 • Published Feb 1, 2023
mPLUG-PaperOwl: Scientific Diagram Analysis with the Multimodal Large Language Model Paper • 2311.18248 • Published Nov 30, 2023
LLaVA-Critic: Learning to Evaluate Multimodal Models Paper • 2410.02712 • Published Oct 3, 2024 • 38
TiMix: Text-aware Image Mixing for Effective Vision-Language Pre-training Paper • 2312.08846 • Published Dec 14, 2023
Classification Done Right for Vision-Language Pre-Training Paper • 2411.03313 • Published Nov 5, 2024
Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning Paper • 2503.07906 • Published Mar 10 • 4
Emerging Properties in Unified Multimodal Pretraining Paper • 2505.14683 • Published May 20 • 130
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation? Paper • 2407.04842 • Published Jul 5, 2024 • 57