Deliberation in Latent Space via Differentiable Cache Augmentation Paper • 2412.17747 • Published 3 days ago • 25
PaliGemma 2: A Family of Versatile VLMs for Transfer Paper • 2412.03555 • Published 22 days ago • 118
Mimir: Improving Video Diffusion Models for Precise Text Understanding Paper • 2412.03085 • Published 22 days ago • 12
ShowUI: One Vision-Language-Action Model for GUI Visual Agent Paper • 2411.17465 • Published 30 days ago • 76