Update README.md
README.md
CHANGED
@@ -1,136 +1,13 @@
-
-
-
-
-
-
-
-
-
-
-
-
-
-<a href='https://huggingface.co/spaces/depth-anything/Depth-Anything-V2'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue'></a>
-<a href='https://huggingface.co/datasets/depth-anything/DA-2K'><img src='https://img.shields.io/badge/Benchmark-DA--2K-yellow' alt='Benchmark'></a>
-</div>
-
-This work presents Depth Anything V2. It significantly outperforms [V1](https://github.com/LiheYoung/Depth-Anything) in fine-grained details and robustness. Compared with SD-based models, it enjoys faster inference speed, fewer parameters, and higher depth accuracy.
-
-
-
-## News
-
-- **2024-06-14:** Paper, project page, code, models, demo, and benchmark are all released.
-
-
-## Pre-trained Models
-
-We provide **four models** of varying scales for robust relative depth estimation:
-
-| Model | Params | Checkpoint |
-|:-|-:|:-:|
-| Depth-Anything-V2-Small | 24.8M | [Download](https://huggingface.co/depth-anything/Depth-Anything-V2-Small/resolve/main/depth_anything_v2_vits.pth?download=true) |
-| Depth-Anything-V2-Base | 97.5M | [Download](https://huggingface.co/depth-anything/Depth-Anything-V2-Base/resolve/main/depth_anything_v2_vitb.pth?download=true) |
-| Depth-Anything-V2-Large | 335.3M | [Download](https://huggingface.co/depth-anything/Depth-Anything-V2-Large/resolve/main/depth_anything_v2_vitl.pth?download=true) |
-| Depth-Anything-V2-Giant | 1.3B | Coming soon |
-
-
-### Code snippet to use our models
-```python
-import cv2
-import torch
-
-from depth_anything_v2.dpt import DepthAnythingV2
-
-# take depth-anything-v2-large as an example
-model = DepthAnythingV2(encoder='vitl', features=256, out_channels=[256, 512, 1024, 1024])
-model.load_state_dict(torch.load('checkpoints/depth_anything_v2_vitl.pth', map_location='cpu'))
-model.eval()
-
-raw_img = cv2.imread('your/image/path')
-depth = model.infer_image(raw_img) # HxW raw depth map
-```
-
-## Usage
-
-### Installation
-
-```bash
-git clone https://github.com/DepthAnything/Depth-Anything-V2
-cd Depth-Anything-V2
-pip install -r requirements.txt
-```
-
-### Running
-
-```bash
-python run.py --encoder <vits | vitb | vitl | vitg> --img-path <path> --outdir <outdir> [--input-size <size>] [--pred-only] [--grayscale]
-```
-Options:
-- `--img-path`: You can either 1) point it to an image directory storing all interested images, 2) point it to a single image, or 3) point it to a text file storing all image paths.
-- `--input-size` (optional): By default, we use input size `518` for model inference. **You can increase the size for even more fine-grained results.**
-- `--pred-only` (optional): Only save the predicted depth map, without raw image.
-- `--grayscale` (optional): Save the grayscale depth map, without applying color palette.
-
-For example:
-```bash
-python run.py --encoder vitl --img-path assets/examples --outdir depth_vis
-```
-
-**If you want to use Depth Anything V2 on videos:**
-
-```bash
-python run_video.py --encoder vitl --video-path assets/examples_video --outdir video_depth_vis
-```
-
-*Please note that our larger model has better temporal consistency on videos.*
-
-
-### Gradio demo
-
-To use our gradio demo locally:
-
-```bash
-python app.py
-```
-
-You can also try our [online demo](https://huggingface.co/spaces/Depth-Anything/Depth-Anything-V2).
-
-**Note:** Compared to V1, we have made a minor modification to the DINOv2-DPT architecture (originating from this [issue](https://github.com/LiheYoung/Depth-Anything/issues/81)). In V1, we *unintentionally* used features from the last four layers of DINOv2 for decoding. In V2, we use [intermediate features](https://github.com/DepthAnything/Depth-Anything-V2/blob/2cbc36a8ce2cec41d38ee51153f112e87c8e42d8/depth_anything_v2/dpt.py#L164-L169) instead. Although this modification did not improve details or accuracy, we decided to follow this common practice.
-
-
-
-## Fine-tuned to Metric Depth Estimation
-
-Please refer to [metric depth estimation](./metric_depth).
-
-
-## DA-2K Evaluation Benchmark
-
-Please refer to [DA-2K benchmark](./DA-2K.md).
-
-## LICENSE
-
-Depth-Anything-V2-Small model is under the Apache-2.0 license. Depth-Anything-V2-Base/Large/Giant models are under the CC-BY-NC-4.0 license.
-
-
-## Citation
-
-If you find this project useful, please consider citing:
-
-```bibtex
-@article{depth_anything_v2,
-title={Depth Anything V2},
-author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Zhao, Zhen and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
-journal={arXiv:2406.09414},
-year={2024}
-}
-
-@inproceedings{depth_anything_v1,
-title={Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data},
-author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
-booktitle={CVPR},
-year={2024}
-}
-```
+---
+title: Depth Anything V2
+emoji: 🌖
+colorFrom: red
+colorTo: indigo
+sdk: gradio
+sdk_version: 4.36.0
+app_file: app.py
+pinned: false
+license: apache-2.0
+---
+
+Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
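
The added lines are the YAML front matter a Space reads from the top of its README.md: a block delimited by `---` lines with one `key: value` pair per line. As a rough illustration of that structure, the sketch below splits such a README into metadata and body using only the Python standard library. The `parse_front_matter` helper is hypothetical and is not part of the Hub tooling; it assumes flat `key: value` pairs, keeps every value as a string, and leaves out the `emoji` field for brevity. A real YAML parser would be the safer choice for anything beyond this simple case.

```python
def parse_front_matter(text):
    """Split a README into (metadata dict, body).

    Simplified sketch: handles only flat `key: value` lines between two
    `---` delimiters and keeps all values as plain strings.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter at all
    meta = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Closing delimiter found: everything after it is the body.
            return meta, "\n".join(lines[i + 1:])
        if ":" in line:
            # Split on the first colon only, so values may contain colons.
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return {}, text  # opening delimiter without a closing one


README = """---
title: Depth Anything V2
colorFrom: red
colorTo: indigo
sdk: gradio
sdk_version: 4.36.0
app_file: app.py
pinned: false
license: apache-2.0
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
"""

meta, body = parse_front_matter(README)
print(meta["sdk"], meta["sdk_version"], meta["app_file"])  # prints: gradio 4.36.0 app.py
```

Note that `pinned` comes back as the string `"false"` rather than a boolean, which is one reason to prefer a full YAML parser when the types matter.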