WorldVLA: Towards Autoregressive Action World Model Paper • 2506.21539 • Published 3 days ago • 33 • 3
OneTwoVLA: A Unified Vision-Language-Action Model with Adaptive Reasoning Paper • 2505.11917 • Published May 17 • 1
LoHoVLA: A Unified Vision-Language-Action Model for Long-Horizon Embodied Tasks Paper • 2506.00411 • Published 30 days ago • 30
Article PaliGemma – Google's Cutting-Edge Open Vision Language Model By merve and 2 others • May 14, 2024 • 254
Article 🤗 PEFT: Parameter-Efficient Fine-Tuning of Billion-Scale Models on Low-Resource Hardware By smangrul and 1 other • Feb 10, 2023 • 86
World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning Paper • 2503.10480 • Published Mar 13 • 54
Improved Visual-Spatial Reasoning via R1-Zero-Like Training Paper • 2504.00883 • Published Apr 1 • 64
Show-o Turbo: Towards Accelerated Unified Multimodal Understanding and Generation Paper • 2502.05415 • Published Feb 8 • 22