sam2ai's Collections
MM_LLM
OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models
Paper • arXiv:2308.01390 • 32 upvotes

Med-Flamingo: a Multimodal Medical Few-shot Learner
Paper • arXiv:2307.15189 • 22 upvotes

BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs
Paper • arXiv:2307.08581 • 27 upvotes

GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
Paper • arXiv:2307.03601 • 11 upvotes

Towards Language Models That Can See: Computer Vision Through the LENS of Natural Language
Paper • arXiv:2306.16410 • 27 upvotes

ImageBind-LLM: Multi-modality Instruction Tuning
Paper • arXiv:2309.03905 • 16 upvotes

NExT-GPT: Any-to-Any Multimodal LLM
Paper • arXiv:2309.05519 • 78 upvotes

Mirasol3B: A Multimodal Autoregressive Model for Time-Aligned and Contextual Modalities
Paper • arXiv:2311.05698 • 9 upvotes

CogAgent: A Visual Language Model for GUI Agents
Paper • arXiv:2312.08914 • 29 upvotes

Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
Paper • arXiv:2312.03818 • 32 upvotes