diwank's Collections
Vision
Updated 11 days ago
apple/DepthPro
Depth Estimation
•
Updated
9 days ago
•
1.94k
•
404
rhymes-ai/Aria
Image-Text-to-Text
•
Updated
Jan 27
•
22.4k
•
617
mit-han-lab/hart-0.7b-1024px
Unconditional Image Generation
•
Updated
Nov 17, 2024
•
9
deepseek-ai/Janus-1.3B
Any-to-Any
•
Updated
Jan 27
•
127k
•
579
neulab/PangeaInstruct
Updated
Feb 2
•
652
•
82
genmo/mochi-1-preview
Text-to-Video
•
Updated
Dec 18, 2024
•
24.9k
•
1.19k
stabilityai/stable-diffusion-3.5-large
Text-to-Image
•
Updated
Oct 22, 2024
•
158k
•
2.43k
Freepik/flux.1-lite-8B-alpha
Text-to-Image
•
Updated
Dec 30, 2024
•
25k
•
410
microsoft/OmniParser
Image-Text-to-Text
•
Updated
Dec 2, 2024
•
2.57k
•
1.64k
mistralai/Pixtral-12B-Base-2409
Updated
Feb 2
•
94
neulab/Pangea-7B
Updated
Oct 24, 2024
•
10.6k
•
126
jadechoghari/Ferret-UI-Llama8b
Image-Text-to-Text
•
Updated
Jan 8
•
767
•
68
OpenGVLab/InternVL2-1B
Image-Text-to-Text
•
Updated
Feb 5
•
84.2k
•
65
OpenGVLab/InternVL2-2B
Image-Text-to-Text
•
Updated
Feb 5
•
198k
•
66
OpenGVLab/Mono-InternVL-2B
Image-Text-to-Text
•
Updated
10 days ago
•
6.67k
•
32
OpenGVLab/OmniCorpus-YT
Updated
Nov 17, 2024
•
422
•
12
OpenGVLab/OmniCorpus-CC-210M
Viewer
•
Updated
Nov 17, 2024
•
208M
•
224
•
19
OpenGVLab/OmniCorpus-CC
Viewer
•
Updated
Nov 17, 2024
•
986M
•
14.5k
•
13
OpenGVLab/InternVideo2_chat_8B_HD
Video-Text-to-Text
•
Updated
Dec 18, 2024
•
431
•
16
OpenGVLab/ViCLIP
Updated
Jun 7, 2024
•
35
OpenGVLab/ASMv2
Text Generation
•
Updated
Feb 29, 2024
•
123
•
17
OpenGVLab/VideoChat2-IT
Viewer
•
Updated
Jun 29, 2024
•
1.82M
•
350
•
49
NimVideo/cogvideox-2b-img2vid
Image-to-Video
•
Updated
Oct 28, 2024
•
345
•
76
BAAI/Infinity-MM
Updated
Dec 13, 2024
•
13.7k
•
92
nvidia/RADIO-H
Updated
Dec 2, 2024
•
1.2k
•
9
Spawning/PD12M
Viewer
•
Updated
Jan 9
•
12.4M
•
2.02k
•
152
Shitao/OmniGen-v1
Text-to-Image
•
Updated
Nov 7, 2024
•
14.1k
•
297
InstantX/InstantIR
Image-to-Image
•
Updated
Nov 7, 2024
•
1
•
165
nvidia/Cosmos-0.1-Tokenizer-DI8x8
Updated
Dec 25, 2024
•
221
•
11
BAAI/Emu3-Chat
Text Generation
•
Updated
Oct 24, 2024
•
5.34k
•
71
briaai/RMBG-2.0
Image Segmentation
•
Updated
2 days ago
•
1.07M
•
661
Watermark Anything with Localized Messages
Paper
•
2411.07231
•
Published
Nov 11, 2024
•
20
rain1011/pyramid-flow-miniflux
Text-to-Video
•
Updated
Nov 13, 2024
•
166
OpenGVLab/InternVL2-8B-MPO
Image-Text-to-Text
•
Updated
Dec 20, 2024
•
624
•
36
mistralai/Pixtral-Large-Instruct-2411
Image-Text-to-Text
•
Updated
Dec 26, 2024
•
17
•
396
briaai/BRIA-2.3
Text-to-Image
•
Updated
11 days ago
•
2.23k
•
36
microsoft/Reducio-VAE
Updated
Nov 21, 2024
•
16
•
15
Lightricks/LTX-Video
Text-to-Video
•
Updated
3 days ago
•
348k
•
1.05k
apple/aimv2-3B-patch14-448
Image Feature Extraction
•
Updated
9 days ago
•
1.84k
•
11
THUdyh/Insight-V-Reason
Text Generation
•
Updated
Nov 22, 2024
•
19
•
9
black-forest-labs/FLUX.1-Fill-dev
Updated
Nov 25, 2024
•
72k
•
561
Efficient-Large-Model/Sana_1600M_512px
Text-to-Image
•
Updated
Jan 10
•
229
•
38
Efficient-Large-Model/Sana_1600M_1024px
Text-to-Image
•
Updated
Jan 10
•
24.8k
•
196
AIDC-AI/Ovis1.6-Gemma2-27B
Image-Text-to-Text
•
Updated
11 days ago
•
615
•
63
HuggingFaceTB/SmolVLM-Base
Image-Text-to-Text
•
Updated
Nov 28, 2024
•
6.02k
•
66
THUDM/glm-edge-v-5b
Image-Text-to-Text
•
Updated
Jan 2
•
1.42k
•
12
rhymes-ai/Aria-Base-64K
Image-Text-to-Text
•
Updated
Dec 1, 2024
•
245
•
12
allenai/pixmo-point-explanations
Viewer
•
Updated
Dec 5, 2024
•
79.6k
•
225
•
7
tencent/HunyuanVideo
Text-to-Video
•
Updated
3 days ago
•
6.02k
•
1.74k
tencent/HunyuanVideo-PromptRewrite
Updated
Dec 6, 2024
•
114
•
45
google/paligemma2-28b-pt-896
Image-Text-to-Text
•
Updated
Dec 5, 2024
•
453
•
47
OpenGVLab/InternVL2_5-78B
Image-Text-to-Text
•
Updated
Feb 5
•
4.63k
•
180
MAmmoTH-VL/MAmmoTH-VL-8B
Updated
Dec 9, 2024
•
90
•
18
MAmmoTH-VL/MAmmoTH-VL-Instruct-12M
Viewer
•
Updated
Jan 5
•
37M
•
5.53k
•
46
OpenGVLab/PVC-InternVL2-8B
Image-Text-to-Text
•
Updated
Dec 17, 2024
•
26
•
8
BGLab/BioTrove
Viewer
•
Updated
Dec 13, 2024
•
163M
•
920
•
9
TencentARC/NVComposer
Image-to-3D
•
Updated
Dec 16, 2024
•
65
•
7
deepseek-ai/deepseek-vl2
Image-Text-to-Text
•
Updated
Dec 18, 2024
•
18.4k
•
298
FastVideo/FastHunyuan
Text-to-Video
•
Updated
Jan 8
•
281
•
179
BAAI/nova-d48w1536-sdxl1024
Text-to-Image
•
Updated
Dec 21, 2024
•
17
•
7
IamCreateAI/Ruyi-Mini-7B
Image-to-Video
•
Updated
Dec 25, 2024
•
2.49k
•
602
Infinigence/Megrez-3B-Omni
Updated
23 days ago
•
76
•
130
microsoft/VidTok
Updated
Jan 14
•
33
TIGER-Lab/Mantis-8B-siglip-llama3
Image-Text-to-Text
•
Updated
Nov 15, 2024
•
14.4k
•
33
OpenGVLab/HoVLE-HD
Image-Text-to-Text
•
Updated
28 days ago
•
39
•
8
nyu-visionx/cambrian-34b
Text Generation
•
Updated
Jun 28, 2024
•
427
•
28
nyu-visionx/cambrian-phi3-3b
Text Generation
•
Updated
Jul 6, 2024
•
384
•
11
nyu-visionx/Cambrian-Alignment
Viewer
•
Updated
Jul 23, 2024
•
292k
•
7.86k
•
33
nvidia/Cosmos-1.0-Autoregressive-13B-Video2World
Updated
30 days ago
•
507
•
31
nvidia/Cosmos-1.0-Diffusion-14B-Video2World
Updated
30 days ago
•
56.5k
•
52
nvidia/Cosmos-1.0-Diffusion-14B-Text2World
Updated
Jan 10
•
85.8k
•
49
nvidia/Cosmos-1.0-Autoregressive-12B
Updated
26 days ago
•
523
•
29
StephanST/WALDO30
Object Detection
•
Updated
Oct 9, 2024
•
222
ByteDance/Sa2VA-8B
Image-Text-to-Text
•
Updated
Jan 14
•
1.46k
•
55
OpenGVLab/VideoChat-Flash-Qwen2_5-2B_res448
Video-Text-to-Text
•
Updated
5 days ago
•
1.1k
•
13
OpenGVLab/VideoMAEv2-giant
Video Classification
•
Updated
13 days ago
•
1.5k
•
3
MiniMaxAI/MiniMax-VL-01
Image-Text-to-Text
•
Updated
15 days ago
•
465
•
245
NimVideo/mochi-1-transformer-42
Text-to-Video
•
Updated
Jan 13
•
143
•
2
ostris/Flex.1-alpha
Text-to-Image
•
Updated
Jan 19
•
33.2k
•
397
tencent/Hunyuan3D-2
Image-to-3D
•
Updated
9 days ago
•
38.2k
•
1.04k
deepseek-ai/Janus-Pro-1B
Any-to-Any
•
Updated
Feb 1
•
100k
•
398
deepseek-ai/Janus-Pro-7B
Any-to-Any
•
Updated
Feb 1
•
372k
•
3.2k
Qwen/Qwen2.5-VL-72B-Instruct
Image-Text-to-Text
•
Updated
2 days ago
•
270k
•
360
nvidia/Eagle2-9B
Image-Text-to-Text
•
Updated
Jan 28
•
1.21k
•
45
m-a-p/PIN-100M
Viewer
•
Updated
6 days ago
•
68.1k
•
61.7k
•
6
AIDC-AI/Ovis2-34B
Image-Text-to-Text
•
Updated
10 days ago
•
3.61k
•
127
microsoft/OmniParser-v2.0
Image-Text-to-Text
•
Updated
19 days ago
•
8.78k
•
1.12k
Alpha-VLLM/Lumina-Image-2.0
Text-to-Image
•
Updated
about 1 month ago
•
38.3k
•
279
prithivMLmods/JSONify-Flux
Image-Text-to-Text
•
Updated
21 days ago
•
229
•
12
Skywork/SkyReels-V1-Hunyuan-I2V
Image-to-Video
•
Updated
13 days ago
•
51.9k
•
247
Skywork/SkyReels-A1
Image-to-Video
•
Updated
5 days ago
•
786
•
45
AIDC-AI/Ovis2-16B
Image-Text-to-Text
•
Updated
10 days ago
•
2.91k
•
76
curateIT/themet_openaccess_bestof
Viewer
•
Updated
Apr 7, 2024
•
1.77k
•
41
•
1
MnLgt/yolo-human-parse
Image Classification
•
Updated
Sep 19, 2024
•
136
•
5
google/paligemma2-3b-mix-448
Image-Text-to-Text
•
Updated
about 1 month ago
•
10.4k
•
39
google/paligemma2-28b-mix-448
Image-Text-to-Text
•
Updated
about 1 month ago
•
725
•
25
HuggingFaceTB/SmolVLM2-2.2B-Instruct
Image-Text-to-Text
•
Updated
3 days ago
•
429k
•
106
Wan-AI/Wan2.1-T2V-14B
Text-to-Video
•
Updated
11 days ago
•
186k
•
949
allenai/olmOCR-7B-0225-preview
Image-Text-to-Text
•
Updated
13 days ago
•
142k
•
496
microsoft/Phi-4-multimodal-instruct
Automatic Speech Recognition
•
Updated
1 day ago
•
231k
•
1.03k