base_model:
- Qwen/Qwen2.5-7B-Instruct
pipeline_tag: image-text-to-text
tags:
- depth-estimation
## SSR-MIDI-7B
This repository hosts the models from the paper [SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning](https://arxiv.org/abs/2505.12448).