SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics Paper • 2506.01844 • Published 5 days ago • 75
SmolVLA Collection Small, efficient and light-weight VLAs pretrained on community datasets • 1 item • Updated 7 days ago • 19
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation Paper • 2401.02117 • Published Jan 4, 2024 • 33
Article SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data By danaaubakirova and 8 others • 5 days ago • 96
Article Vision Language Models (Better, Faster, Stronger) By merve and 4 others • 27 days ago • 420
Article SmolLM - blazingly fast and remarkably powerful By loubnabnl and 2 others • Jul 16, 2024 • 375
Multimodal DSE Retrievers Collection A collection of DSE models for multimodal retrieval • 5 items • Updated Apr 15 • 14
Article NVIDIA's GTC 2025 Announcement for Physical AI Developers: New Open Models and Datasets By mingyuliutw and 4 others • Mar 18 • 41
Article π0 and π0-FAST: Vision-Language-Action Models for General Robot Control By danaaubakirova and 3 others • Feb 4 • 158
Article SmolVLM - small yet mighty Vision Language Model By andito and 4 others • Nov 26, 2024 • 308
Article MCP is All You Need: The Future of AI Interoperability By LLMhacker • Mar 18 • 8
Article Open-Source Handwritten Signature Detection Model By samuellimabraz • Mar 14 • 113
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published Mar 12 • 72
ViDoRe Benchmark Collection Benchmark for document retrieval using visual features, introduced in the ColPali paper. Datasets are using the QA format. • 10 items • Updated Jan 23 • 18
Article Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints By sergeipetrov and 3 others • May 1, 2024 • 77