LLaVE - a zhibinlan Collection

LLaVE

Updated Mar 10

LLaVE is a series of large language and vision embedding models trained on a variety of multimodal embedding datasets.

zhibinlan/LLaVE-0.5B

Image-Text-to-Text • Updated Mar 14 • 2.98k • 7

zhibinlan/LLaVE-2B

Image-Text-to-Text • Updated Mar 14 • 20.1k • 45

zhibinlan/LLaVE-7B

Image-Text-to-Text • Updated Mar 14 • 1.46k • 5

LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning

Paper • 2503.04812 • Published Mar 4 • 14