zhibinlan/LLaVE-0.5B
Image-Text-to-Text • 2.98k downloads • 7 likes
LLaVE is a series of large language and vision embedding models trained on a variety of multimodal embedding datasets.