Models
1,240
Sort: Trending
Active filters:
multimodal
mradermacher/SpaceThinker-Qwen2.5VL-3B-GGUF • Robotics • 3B • Updated 6 days ago • 327 • 2
mradermacher/SpaceThinker-Qwen2.5VL-3B-i1-GGUF • Robotics • 3B • Updated 6 days ago • 413 • 2
lusxvr/nanoVLM-222M • Image-Text-to-Text • 0.2B • Updated May 8 • 2.21k • 88
openbmb/AgentCPM-GUI • Image-Text-to-Text • 8B • Updated 13 days ago • 801 • 120
lmstudio-community/Qwen2.5-VL-72B-Instruct-GGUF • Image-Text-to-Text • 73B • Updated May 10 • 773 • 1
csfufu/Revisual-R1-final • Image-Text-to-Text • 8B • Updated 22 days ago • 640 • 5
unsloth/Qwen2.5-VL-7B-Instruct-GGUF • Image-Text-to-Text • 8B • Updated May 12 • 9.5k • 9
unsloth/Qwen2.5-VL-32B-Instruct-GGUF • Image-Text-to-Text • 33B • Updated May 12 • 4.67k • 4
osunlp/WebJudge-7B • Image-Text-to-Text • 8B • Updated May 12 • 80 • 5
ggml-org/Qwen2.5-Omni-7B-GGUF • Any-to-Any • 8B • Updated May 26 • 2.47k • 8
stockmark/Stockmark-2-VL-100B-beta • Image-Text-to-Text • 96B • Updated 25 days ago • 1.95k • 18
imageomics/bioclip-2 • Zero-Shot Image Classification • Updated 22 days ago • 5.08k • 10
davidelobba/TEMU-VTOFF • Image-to-Image • Updated 28 days ago • 3
OpenGVLab/ZeroGUI-AndroidLab-7B • Image-Text-to-Text • 8B • Updated 29 days ago • 88 • 4
Sungyeon/GENIUS • Visual Document Retrieval • Updated 21 days ago • 1
humbleakh/qwen2.5-vl-3b-8bit-chain-of-zoom • Image-to-Text • Updated 19 days ago • 72 • 1
mehmetkuzucu/Waffle-v1.0 • Visual Question Answering • 0.2B • Updated 17 days ago • 98 • 4
mradermacher/SpaceOm-GGUF • 3B • Updated 7 days ago • 325 • 2
mradermacher/SpaceOm-i1-GGUF • 3B • Updated 7 days ago • 630 • 2
rinabuoy/nanoVLM • Image-Text-to-Text • 0.2B • Updated 8 days ago • 30 • 2
adriabama06/UI-TARS-1.5-7B-exl2 • Image-Text-to-Text • Updated 7 days ago • 2 • 1
adriabama06/UI-TARS-1.5-7B-Q4_K_M-GGUF • Image-Text-to-Text • 8B • Updated 7 days ago • 22 • 1
adriabama06/UI-TARS-1.5-7B-GGUF • Image-Text-to-Text • 8B • Updated 7 days ago • 134 • 1
avin-255/nanoVLM • Image-Text-to-Text • 0.2B • Updated 6 days ago • 18 • 1
thesby/Qwen2.5-VL-7B-NSFW-Caption-V3 • Image-Text-to-Text • 8B • Updated 10 days ago • 168 • 7
sujitpal/clip-imageclef • Zero-Shot Image Classification • Updated Oct 31, 2023 • 60 • 3
waybarrios/guidance-based-video-grounding • Updated Apr 1, 2023
MonoHime/mosei-senti-intermodal • Feature Extraction • Updated May 18, 2023 • 52
MonoHime/mosei-emo-intermodal • Feature Extraction • Updated May 18, 2023 • 39
MonoHime/iemocap-emo-intermodal • Feature Extraction • Updated May 18, 2023 • 23