A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise Paper • 2312.12436 • Published Dec 19, 2023 • 13
Masked Autoencoders are Efficient Class Incremental Learners Paper • 2308.12510 • Published Aug 24, 2023
Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion Paper • 2009.05757 • Published Sep 12, 2020
Woodpecker: Hallucination Correction for Multimodal Large Language Models Paper • 2310.16045 • Published Oct 24, 2023 • 15
MMICT: Boosting Multi-Modal Fine-Tuning with In-Context Examples Paper • 2312.06363 • Published Dec 11, 2023 • 1
MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models Paper • 2306.13394 • Published Jun 23, 2023
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis Paper • 2405.21075 • Published May 31, 2024 • 21
CAPro: Webly Supervised Learning with Cross-Modality Aligned Prototypes Paper • 2310.09761 • Published Oct 15, 2023
Sinkhorn Distance Minimization for Knowledge Distillation Paper • 2402.17110 • Published Feb 27, 2024
RESTORE: Towards Feature Shift for Vision-Language Prompt Learning Paper • 2403.06136 • Published Mar 10, 2024
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models Paper • 2408.02085 • Published Aug 4, 2024 • 17
Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models Paper • 2408.15915 • Published Aug 28, 2024 • 19
Improving Factuality in Large Language Models via Decoding-Time Hallucinatory and Truthful Comparators Paper • 2408.12325 • Published Aug 22, 2024
FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression Paper • 2412.04317 • Published Dec 5, 2024
Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM Paper • 2411.00774 • Published Nov 1, 2024
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction Paper • 2501.01957 • Published Jan 2025 • 34