base_model:
- Qwen/Qwen2.5-7B-Instruct
pipeline_tag: image-text-to-text
tags:
- depth-estimation
SSR-MIDI-7B
This repository hosts the models from the paper "SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning".