HuggingFaceM4/the_cauldron
1.88M rows • 128k downloads • 492 likes
A collection of public datasets for vision-language modalities, especially for frozen vision-language alignment.
Note SynthRecap