Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme Paper • 2504.02587 • Published Apr 3 • 33
thedeoxen/FLUX.1-Kontext-dev-reference-depth-fusion-LORA Image-to-Image • Updated 13 days ago • 2.18k • 47
google/owlv2-base-patch16-ensemble Zero-Shot Object Detection • 0.2B • Updated Oct 31, 2024 • 370k • 110