MutiModal_Dataset
• 1.83k • 116
• 25.4k • 135
WildVision/wildvision-chat
• 45.2k • 206 • 20
• 12.4M • 1.3k • 170
lmms-lab/LLaVA-Video-178K
• 1.63M • 13.9k • 187
• 7.29M • 454 • 49
• 1.66M • 14
VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
Paper • 2409.04429
• 235M • 2.67k • 45
• 9.81M • 781 • 53
JefferyZhan/Language-prompted-Localization-Dataset
• 84 • 4
• 392 • 32 • 12
mlfoundations/MINT-1T-HTML
• 623M • 47.2k • 91
DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding
Paper • 2411.14347 • 16
• 58 • 51
• 72.5k • 141 • 10
• 10.9M • 13 • 9
• 2.18M • 11 • 2
• 110k • 84 • 4
Salesforce/blip3-grounding-50m
• 52.4M • 309 • 27
Intelligent-Internet/II-Thought-RL-v0
• 342k • 269 • 54
DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning
Paper • 2504.11456 • 12
• 217M • 109k • 114