Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers Paper • 2506.03065 • Published 4 days ago • 27
MotionSight: Boosting Fine-Grained Motion Understanding in Multimodal LLMs Paper • 2506.01674 • Published 6 days ago • 26
NORA: A Small Open-Sourced Generalist Vision Language Action Model for Embodied Tasks Paper • 2504.19854 • Published Apr 28 • 7
LoHoVLA: A Unified Vision-Language-Action Model for Long-Horizon Embodied Tasks Paper • 2506.00411 • Published 8 days ago • 28
Taming LLMs by Scaling Learning Rates with Gradient Grouping Paper • 2506.01049 • Published 6 days ago • 36
Holo1 Collection Vision-Language Action Model for use in Surfer-H web navigation agent • 5 items • Updated 3 days ago • 39
Article Holo1: New family of GUI automation VLMs powering GUI agent Surfer-H By Hcompany and 1 other • 5 days ago • 60
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 26 items • Updated May 1 • 569
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time Paper • 2505.24863 • Published 8 days ago • 89
Hymba: A Hybrid-head Architecture for Small Language Models Paper • 2411.13676 • Published Nov 20, 2024 • 46
Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding Paper • 2505.22618 • Published 10 days ago • 39
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models Paper • 2505.24864 • Published 8 days ago • 115
AnySplat: Feed-forward 3D Gaussian Splatting from Unconstrained Views Paper • 2505.23716 • Published 9 days ago • 31
Kimina Prover Preview Collection State-of-the-Art Models for Formal Mathematical Reasoning • 5 items • Updated Apr 28 • 31
ZeroGUI: Automating Online GUI Learning at Zero Human Cost Paper • 2505.23762 • Published 9 days ago • 45
Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration Paper • 2505.20256 • Published 12 days ago • 17
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning Paper • 2505.17667 • Published 16 days ago • 86
Exploring the Latent Capacity of LLMs for One-Step Text Generation Paper • 2505.21189 • Published 12 days ago • 60