nyu-visionx/Cambrian-S-1.5B • Image-to-Text • 2B • Updated • 21 • 3
SIMS-V: Simulated Instruction-Tuning for Spatial Video Understanding
Benchmark Designers Should "Train on the Test Set" to Expose Exploitable Non-Visual Shortcuts