LaViDa-1.0 Collection LArge VIsion-language Diffusion moDel with mAsking • 11 items • Updated 6 days ago • 6
OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning Paper • 2505.08617 • Published 19 days ago • 40
Article What is test-time compute and how to scale it? By Kseniase and 1 other • Feb 6 • 89
Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought Paper • 2504.05599 • Published Apr 8 • 83
stabilityai/stable-diffusion-xl-base-1.0 Text-to-Image • Updated Oct 30, 2023 • 3.1M • 6.63k
Article You could have designed state of the art positional encoding By FL33TW00D-HF • Nov 25, 2024 • 280