SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics Paper • 2506.01844 • Published Jun 2 • 127
view article Article PaliGemma – Google's Cutting-Edge Open Vision Language Model By merve and 2 others • May 14, 2024 • 265
PEFT papers Collection A collection of methods that have been implemented in the 🤗 PEFT library • 12 items • Updated Jan 30, 2024 • 28
BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular Depth Estimation Paper • 2407.17952 • Published Jul 25, 2024 • 33 • 7
BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular Depth Estimation Paper • 2407.17952 • Published Jul 25, 2024 • 33
view article Article 💃Introducing the first LLM-based Motion understanding model: MotionLLM By EvanTHU • Jun 26, 2024 • 3