Hugging Face
diwank's Collections
Vision
Updated 12 days ago
apple/DepthPro • Depth Estimation • Updated Feb 28 • 1.76k • 432
rhymes-ai/Aria • Image-Text-to-Text • Updated Apr 23 • 19.1k • 629
mit-han-lab/hart-0.7b-1024px • Unconditional Image Generation • Updated Nov 17, 2024 • 13
deepseek-ai/Janus-1.3B • Any-to-Any • Updated Jan 27 • 6.05k • 588
neulab/PangeaInstruct • Updated Feb 2 • 900 • 83
genmo/mochi-1-preview • Text-to-Video • Updated Dec 18, 2024 • 33.6k • 1.22k
stabilityai/stable-diffusion-3.5-large • Text-to-Image • Updated Oct 22, 2024 • 117k • 2.81k
Freepik/flux.1-lite-8B-alpha • Text-to-Image • Updated Dec 30, 2024 • 1.76k • 419
microsoft/OmniParser • Image-Text-to-Text • Updated Dec 2, 2024 • 592 • 1.66k
mistralai/Pixtral-12B-Base-2409 • Updated Feb 2 • 102
neulab/Pangea-7B • Updated Oct 24, 2024 • 12.2k • 129
jadechoghari/Ferret-UI-Llama8b • Image-Text-to-Text • Updated Jan 8 • 107 • 69
OpenGVLab/InternVL2-1B • Image-Text-to-Text • Updated Mar 25 • 36.6k • 71
OpenGVLab/InternVL2-2B • Image-Text-to-Text • Updated Mar 25 • 321k • 69
OpenGVLab/Mono-InternVL-2B • Image-Text-to-Text • Updated Mar 12 • 9.01k • 33
OpenGVLab/OmniCorpus-YT • Updated Mar 20 • 335 • 13
OpenGVLab/OmniCorpus-CC-210M • Viewer • Updated Mar 20 • 208M • 619 • 24
OpenGVLab/OmniCorpus-CC • Viewer • Updated Mar 20 • 872M • 15.2k • 17
OpenGVLab/InternVideo2_chat_8B_HD • Video-Text-to-Text • Updated Dec 18, 2024 • 1.55k • 17
OpenGVLab/ViCLIP • Updated Jun 7, 2024 • 40
OpenGVLab/ASMv2 • Text Generation • Updated Feb 29, 2024 • 265 • 16
OpenGVLab/VideoChat2-IT • Viewer • Updated Jun 29, 2024 • 1.82M • 335 • 50
NimVideo/cogvideox-2b-img2vid • Image-to-Video • Updated Oct 28, 2024 • 179 • 79
BAAI/Infinity-MM • Updated Dec 13, 2024 • 20.3k • 101
nvidia/RADIO-H • Updated Apr 17 • 1.82k • 10
Spawning/PD12M • Viewer • Updated Jan 9 • 12.4M • 3.04k • 157
Shitao/OmniGen-v1 • Text-to-Image • Updated Nov 7, 2024 • 4.7k • 313
InstantX/InstantIR • Image-to-Image • Updated Nov 7, 2024 • 172
nvidia/Cosmos-0.1-Tokenizer-DI8x8 • Updated Dec 25, 2024 • 780 • 11
BAAI/Emu3-Chat • Text Generation • Updated Oct 24, 2024 • 1.68k • 71
briaai/RMBG-2.0 • Image Segmentation • Updated 14 days ago • 65.2k • 774
Watermark Anything with Localized Messages • Paper • 2411.07231 • Published Nov 11, 2024 • 22
rain1011/pyramid-flow-miniflux • Text-to-Video • Updated Nov 13, 2024 • 174
OpenGVLab/InternVL2-8B-MPO • Image-Text-to-Text • Updated Dec 20, 2024 • 221 • 34
mistralai/Pixtral-Large-Instruct-2411 • Image-Text-to-Text • Updated Mar 16 • 413
briaai/BRIA-2.3 • Text-to-Image • Updated Apr 10 • 736 • 37
microsoft/Reducio-VAE • Updated Nov 21, 2024 • 4 • 15
Lightricks/LTX-Video • Text-to-Video • Updated 6 days ago • 329k • 1.59k
apple/aimv2-3B-patch14-448 • Image Feature Extraction • Updated Feb 28 • 677 • 12
THUdyh/Insight-V-Reason • Text Generation • Updated Nov 22, 2024 • 17 • 9
black-forest-labs/FLUX.1-Fill-dev • Updated Nov 25, 2024 • 359k • 718
Efficient-Large-Model/Sana_1600M_512px • Text-to-Image • Updated Jan 10 • 1.84k • 39
Efficient-Large-Model/Sana_1600M_1024px • Text-to-Image • Updated Jan 10 • 4.27k • 206
AIDC-AI/Ovis1.6-Gemma2-27B • Image-Text-to-Text • Updated Feb 26 • 69 • 62
HuggingFaceTB/SmolVLM-Base • Image-Text-to-Text • Updated Nov 28, 2024 • 5.51k • 76
THUDM/glm-edge-v-5b • Image-Text-to-Text • Updated Jan 2 • 152 • 12
rhymes-ai/Aria-Base-64K • Image-Text-to-Text • Updated Dec 1, 2024 • 48 • 14
allenai/pixmo-point-explanations • Viewer • Updated Dec 5, 2024 • 79.6k • 161 • 7
tencent/HunyuanVideo • Text-to-Video • Updated Mar 6 • 1.98k • 1.88k
tencent/HunyuanVideo-PromptRewrite • Updated Dec 6, 2024 • 34 • 49
google/paligemma2-28b-pt-896 • Image-Text-to-Text • Updated Dec 5, 2024 • 247 • 48
OpenGVLab/InternVL2_5-78B • Image-Text-to-Text • Updated Mar 25 • 28.5k • 190
MAmmoTH-VL/MAmmoTH-VL-8B • Updated Dec 9, 2024 • 12 • 18
MAmmoTH-VL/MAmmoTH-VL-Instruct-12M • Viewer • Updated Jan 5 • 37M • 3.83k • 51
OpenGVLab/PVC-InternVL2-8B • Image-Text-to-Text • Updated Dec 17, 2024 • 12 • 8
BGLab/BioTrove • Viewer • Updated Dec 13, 2024 • 163M • 1.11k • 14
TencentARC/NVComposer • Image-to-3D • Updated Dec 16, 2024 • 38 • 7
deepseek-ai/deepseek-vl2 • Image-Text-to-Text • Updated Dec 18, 2024 • 9.48k • 331
FastVideo/FastHunyuan • Text-to-Video • Updated Jan 8 • 36 • 186
BAAI/nova-d48w1536-sdxl1024 • Text-to-Image • Updated Dec 21, 2024 • 8 • 7
IamCreateAI/Ruyi-Mini-7B • Image-to-Video • Updated Dec 25, 2024 • 296 • 610
Infinigence/Megrez-3B-Omni • Updated Feb 14 • 26 • 132
microsoft/VidTok • Updated Apr 5 • 41
TIGER-Lab/Mantis-8B-siglip-llama3 • Image-Text-to-Text • Updated Nov 15, 2024 • 490 • 33
OpenGVLab/HoVLE-HD • Image-Text-to-Text • Updated Feb 9 • 36 • 8
nyu-visionx/cambrian-34b • Text Generation • Updated Jun 28, 2024 • 98 • 28
nyu-visionx/cambrian-phi3-3b • Text Generation • Updated Jul 6, 2024 • 212 • 11
nyu-visionx/Cambrian-Alignment • Viewer • Updated Jul 23, 2024 • 292k • 5.17k • 33
nvidia/Cosmos-1.0-Autoregressive-13B-Video2World • Updated Feb 8 • 70 • 31
nvidia/Cosmos-1.0-Diffusion-14B-Video2World • Updated 19 days ago • 2.62k • 56
nvidia/Cosmos-1.0-Diffusion-14B-Text2World • Updated 19 days ago • 2.81k • 59
nvidia/Cosmos-1.0-Autoregressive-12B • Updated Feb 11 • 79 • 30
StephanST/WALDO30 • Object Detection • Updated Oct 9, 2024 • 235
ByteDance/Sa2VA-8B • Image-Text-to-Text • Updated Mar 19 • 1.11k • 56
OpenGVLab/VideoChat-Flash-Qwen2_5-2B_res448 • Video-Text-to-Text • Updated Mar 16 • 1.63k • 19
OpenGVLab/VideoMAEv2-giant • Video Classification • Updated Feb 25 • 2.38k • 4
MiniMaxAI/MiniMax-VL-01 • Image-Text-to-Text • Updated 14 days ago • 23.4k • 258
NimVideo/mochi-1-transformer-42 • Text-to-Video • Updated Jan 13 • 24 • 3
ostris/Flex.1-alpha • Text-to-Image • Updated Jan 19 • 18.4k • 451
tencent/Hunyuan3D-2 • Image-to-3D • Updated Apr 10 • 342k • 1.46k
deepseek-ai/Janus-Pro-1B • Any-to-Any • Updated Feb 1 • 24.7k • 440
deepseek-ai/Janus-Pro-7B • Any-to-Any • Updated Feb 1 • 92.6k • 3.39k
Qwen/Qwen2.5-VL-72B-Instruct • Image-Text-to-Text • Updated Mar 23 • 175k • 465
nvidia/Eagle2-9B • Image-Text-to-Text • Updated Jan 28 • 3.79k • 57
m-a-p/PIN-100M • Viewer • Updated 4 days ago • 68.1k • 52.5k • 11
AIDC-AI/Ovis2-34B • Image-Text-to-Text • Updated Feb 27 • 747 • 148
microsoft/OmniParser-v2.0 • Updated Mar 28 • 1.02k • 1.25k
Alpha-VLLM/Lumina-Image-2.0 • Text-to-Image • Updated Mar 30 • 2.69k • 316
prithivMLmods/JSONify-Flux • Image-Text-to-Text • Updated Feb 16 • 24 • 3
Skywork/SkyReels-V1-Hunyuan-I2V • Image-to-Video • Updated Feb 24 • 824 • 269
Skywork/SkyReels-A1 • Image-to-Video • Updated Mar 4 • 229 • 60
AIDC-AI/Ovis2-16B • Image-Text-to-Text • Updated Feb 27 • 19.8k • 92
curateIT/themet_openaccess_bestof • Viewer • Updated Apr 7, 2024 • 1.77k • 11 • 1
MnLgt/yolo-human-parse • Image Classification • Updated Sep 19, 2024 • 12 • 6
google/paligemma2-3b-mix-448 • Image-Text-to-Text • Updated Feb 7 • 7.33k • 44
google/paligemma2-28b-mix-448 • Image-Text-to-Text • Updated Feb 7 • 248 • 26
HuggingFaceTB/SmolVLM2-2.2B-Instruct • Image-Text-to-Text • Updated Apr 8 • 82.3k • 191
Wan-AI/Wan2.1-T2V-14B • Text-to-Video • Updated Mar 12 • 38k • 1.27k
allenai/olmOCR-7B-0225-preview • Image-Text-to-Text • Updated Feb 25 • 408k • 651
microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • Updated 24 days ago • 387k • 1.4k
briaai/BRIA-4B-Adapt • Text-to-Image • Updated Mar 7 • 184 • 7
DAMO-NLP-SG/VideoLLaMA3-7B • Visual Question Answering • Updated Mar 20 • 71.3k • 54
ali-vilab/ACE_Plus • Updated Mar 14 • 102 • 243
ByteDance/LatentSync-1.5 • Updated Mar 16 • 63
IDEA-Research/RexSeek-3B • Image-Text-to-Text • Updated Mar 14 • 506 • 8
TIGER-Lab/Vamba-Qwen2-VL-7B • Video-Text-to-Text • Updated Mar 18 • 133 • 16
ds4sd/SmolDocling-256M-preview • Image-Text-to-Text • Updated 9 days ago • 283k • 1.39k
nvidia/Cosmos-Predict1-14B-Video2World • Updated Apr 8 • 316 • 4
nvidia/Cosmos-Transfer1-7B • Updated Apr 8 • 3.47k • 36
CohereLabs/aya-vision-32b • Image-Text-to-Text • Updated 12 days ago • 496 • 204
ByteDance/Sa2VA-26B • Image-Text-to-Text • Updated Mar 19 • 243 • 25
ChaolongYang/KDTalker • Image-to-Video • Updated Mar 30 • 13
Rapidata/OpenAI-4o_t2i_human_preference • Viewer • Updated Mar 28 • 13k • 259 • 34
McGill-NLP/AURORA • Image-to-Image • Updated Dec 21, 2024 • 302 • 4
HiDream-ai/MotionPro • Image-to-Video • Updated 6 days ago • 76
RaphaelLiu/Pusa-V0.5 • Updated Apr 15 • 120 • 43
OpenGVLab/InternVL3-38B • Image-Text-to-Text • Updated about 1 month ago • 101k • 29
ShoufaChen/PixelFlow-Text2Image • Text-to-Image • Updated Apr 12 • 13
FoundationVision/Infinity • Updated Feb 18 • 70 • 49
nvidia/PhysicalAI-SmartSpaces • Updated 2 days ago • 20.4k • 27
nvidia/DAM-3B-Video • Image-Text-to-Text • Updated 18 days ago • 15.8k • 52
nvidia/DAM-3B-Self-Contained • Image-Text-to-Text • Updated 18 days ago • 5.16k • 21
OpenGVLab/VideoChat-R1_7B • Video-Text-to-Text • Updated Apr 22 • 5.35k • 7
Skywork/SkyCaptioner-V1 • Video-Text-to-Text • Updated about 1 month ago • 683 • 37
Fintor/Fintor-GUI-S2 • Image-Text-to-Text • Updated Apr 24 • 158 • 4
ByteDance-Seed/UI-TARS-7B-DPO • Image-Text-to-Text • Updated Jan 25 • 127k • 213
OpenGVLab/InternVL_2_5_HiCo_R64 • Video-Text-to-Text • Updated 13 days ago • 167 • 3