MedGemma Release Collection Collection of Gemma 3 variants for performance on medical text and image comprehension to accelerate building healthcare-based AI applications. • 4 items • Updated 21 days ago • 164
view article Article LeRobot Community Datasets: The “ImageNet” of Robotics — When and How? By danaaubakirova and 6 others • May 11 • 64
Describe Anything Collection Multimodal Large Language Models for Detailed Localized Image and Video Captioning • 7 items • Updated 3 days ago • 51
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models Paper • 2502.01061 • Published Feb 3 • 218
view article Article SigLIP 2: A better multilingual vision language encoder By ariG23498 and 2 others • Feb 21 • 167
MangaNinja: Line Art Colorization with Precise Reference Following Paper • 2501.08332 • Published Jan 14 • 60
BhasaAnuvaad Collection A Speech Translation Dataset for 13 Indian Languages • 11 items • Updated Jan 16 • 19