MMaDA: Multimodal Large Diffusion Language Models Paper • 2505.15809 • Published 8 days ago • 83
Emerging Properties in Unified Multimodal Pretraining Paper • 2505.14683 • Published 9 days ago • 124
Article Interactive Tools for machine learning, deep learning, and math By Suzana • 4 days ago • 31
Changelog Xet is now the default storage option for new users and organizations 7 days ago • 46
Article nanoVLM: The simplest repository to train your VLM in pure PyTorch By ariG23498 and 6 others • 9 days ago • 115
SSR Collection Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning • 5 items • Updated 5 days ago • 1
MedGemma Release Collection Collection of Gemma 3 variants for performance on medical text and image comprehension to accelerate building healthcare-based AI applications. • 4 items • Updated 7 days ago • 136
OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning Paper • 2505.04601 • Published 22 days ago • 26
Article The Transformers Library: standardizing model definitions By lysandre and 3 others • 15 days ago • 104
Cosmos Transfer1 Collection Multimodal Conditional World Generation for World2World Transfer • 6 items • Updated 9 days ago • 17
Cosmos Tokenize1 Collection A suite of image and video tokenizers • 9 items • Updated 9 days ago • 7
Cosmos Predict1 Collection World Foundation Model for Future Prediction • 14 items • Updated 9 days ago • 7
Cosmos-Reason1 Collection Multimodal world understanding through reasoning • 5 items • Updated 8 days ago • 24
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning Paper • 2503.15558 • Published Mar 18 • 47
Article Microsoft and Hugging Face expand collaboration By jeffboudier and 2 others • 11 days ago • 20
Article NVIDIA Cosmos Now Available On Hugging Face For Physical AI Reasoning By PranjaliJoshi and 1 other • 10 days ago • 24
Article Falcon-Edge: A series of powerful, universal, fine-tunable 1.58bit language models. By tiiuae and 9 others • 15 days ago • 32