UniME is a series of multimodal large language models trained to learn universal multimodal embeddings.
- DeepGlint-AI/UniME-Phi3.5-V-4.2B (Image-Text-to-Text)
- DeepGlint-AI/UniME-LLaVA-1.6-7B (Image-Text-to-Text)
- Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs (Paper, 2504.17432)
- DeepGlint-AI/UniME-LLaVA-OneVision-7B (Image-Text-to-Text)