mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality Paper • 2304.14178 • Published Apr 27, 2023
UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model Paper • 2310.05126 • Published Oct 8, 2023
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding Paper • 2307.02499 • Published Jul 4, 2023
BUS: Efficient and Effective Vision-language Pre-training with Bottom-Up Patch Summarization Paper • 2307.08504 • Published Jul 17, 2023
Evaluation and Analysis of Hallucination in Large Vision-Language Models Paper • 2308.15126 • Published Aug 29, 2023
HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training Paper • 2212.14546 • Published Dec 30, 2022
Learning Trajectory-Word Alignments for Video-Language Tasks Paper • 2301.01953 • Published Jan 5, 2023
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video Paper • 2302.00402 • Published Feb 1, 2023
mPLUG-PaperOwl: Scientific Diagram Analysis with the Multimodal Large Language Model Paper • 2311.18248 • Published Nov 30, 2023
LLaVA-Critic: Learning to Evaluate Multimodal Models Paper • 2410.02712 • Published Oct 3, 2024
TiMix: Text-aware Image Mixing for Effective Vision-Language Pre-training Paper • 2312.08846 • Published Dec 14, 2023
Classification Done Right for Vision-Language Pre-Training Paper • 2411.03313 • Published Nov 5, 2024
Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning Paper • 2503.07906 • Published Mar 10, 2025
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation? Paper • 2407.04842 • Published Jul 5, 2024
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration Paper • 2311.04257 • Published Nov 7, 2023
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks Paper • 2306.04362 • Published Jun 7, 2023