readme
README.md ADDED
@@ -0,0 +1,10 @@
title: RaySt3R
emoji: ✨
colorFrom: indigo
colorTo: red
sdk: gradio
sdk_version: 4.38.1
app_file: app.py
pinned: false
models:
- bartduis/rayst3r

readme.md DELETED
@@ -1,110 +0,0 @@
# RaySt3R: Predicting Novel Depth Maps for Zero-Shot Object Completion
<div align="center">
<a href="https://arxiv.org/abs/2506.05285"><img src='https://img.shields.io/badge/arXiv-Paper-red?logo=arxiv&logoColor=white' alt='arXiv'></a>
<a href='https://rayst3r.github.io'><img src='https://img.shields.io/badge/Project_Page-Website-green?logo=googlechrome&logoColor=white' alt='Project Page'></a>
</div>

<div align="center">
<img src="assets/overview.png" width="80%" alt="Method overview">
</div>

## 📚 Citation
```bibtex
@misc{rayst3r,
  title={RaySt3R: Predicting Novel Depth Maps for Zero-Shot Object Completion},
  author={Bardienus P. Duisterhof and Jan Oberst and Bowen Wen and Stan Birchfield and Deva Ramanan and Jeffrey Ichnowski},
  year={2025},
  eprint={2506.05285},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2506.05285},
}
```

## ✅ TO-DOs

- [x] Inference code
- [x] Local gradio demo
- [ ] Huggingface demo
- [ ] Docker
- [ ] Training code
- [ ] Eval code
- [ ] ViT-S, No-DINO and Pointmap models
- [ ] Dataset release

# ⚙️ Installation

```bash
mamba create -n rayst3r python=3.11 cmake=3.14.0
mamba activate rayst3r
mamba install pytorch torchvision pytorch-cuda=12.4 -c pytorch -c nvidia # change to your version of CUDA
pip install -r requirements.txt

# compile the CUDA kernels for RoPE
cd extensions/curope/
python setup.py build_ext --inplace
cd ../../
```

# 🚀 Usage

The expected input for RaySt3R is a folder with the following structure:

<pre><code>
📁 data_dir/
├── cam2world.pt   # Camera-to-world transformation (PyTorch tensor), 4x4 - eye(4) if not provided
├── depth.png      # Depth image, uint16 with max 10 meters
├── intrinsics.pt  # Camera intrinsics (PyTorch tensor), 3x3
├── mask.png       # Binary mask image
└── rgb.png        # RGB image
</code></pre>

Note that the depth image must be saved as uint16, normalized to a 0-10 meter range. We provide an example directory in `example_scene`.
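For reference, here is a minimal sketch (not part of the repository) of one way to write such a folder from in-memory data. The linear depth scaling (10 m mapped to the uint16 maximum of 65535), the 0/255 mask encoding, and the function and variable names are assumptions made for illustration; it uses OpenCV for 16-bit PNG I/O.

```python
# Illustrative helper, not part of the RaySt3R codebase.
import os
import numpy as np
import torch
import cv2

def write_rayst3r_input(out_dir, rgb, depth_m, mask, K, cam2world=None):
    """rgb: (H,W,3) uint8, depth_m: (H,W) float meters, mask: (H,W) bool, K: (3,3)."""
    os.makedirs(out_dir, exist_ok=True)

    # Depth: clip to 0-10 m and rescale to the full uint16 range (assumed convention).
    depth_u16 = (np.clip(depth_m, 0.0, 10.0) / 10.0 * 65535.0).astype(np.uint16)
    cv2.imwrite(os.path.join(out_dir, "depth.png"), depth_u16)

    # RGB image (OpenCV writes BGR) and binary mask stored as 0/255.
    cv2.imwrite(os.path.join(out_dir, "rgb.png"), cv2.cvtColor(rgb, cv2.COLOR_RGB2BGR))
    cv2.imwrite(os.path.join(out_dir, "mask.png"), mask.astype(np.uint8) * 255)

    # Camera intrinsics and (optional) camera-to-world pose as PyTorch tensors.
    torch.save(torch.tensor(K, dtype=torch.float32), os.path.join(out_dir, "intrinsics.pt"))
    if cam2world is None:
        cam2world = np.eye(4)  # identity pose if none is available, as noted above
    torch.save(torch.tensor(cam2world, dtype=torch.float32), os.path.join(out_dir, "cam2world.pt"))
```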
Run RaySt3R with:

```bash
python3 eval_wrapper/eval.py example_scene/
```
This writes a colored point cloud back into the input directory.

Optional flags:
```bash
--visualize         # Spins up a rerun client to visualize predictions and camera poses
--run_octmae        # Sample novel views with the OctMAE parameters (see paper)
--set_conf N        # Sets the confidence threshold to N
--n_pred_views N    # Number of predicted views along each axis of the grid; 5 gives 22 views in total
--filter_all_masks  # Use all masks: a point is rejected if it falls in the background of any single mask
--tsdf              # Fits a TSDF to the depth maps
```
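For example, a run that visualizes the results, uses a coarser view grid, and fits a TSDF might look like this (the flag values are illustrative, not tuned recommendations):

```bash
python3 eval_wrapper/eval.py example_scene/ --visualize --n_pred_views 3 --tsdf
```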

# 🧪 Gradio app

We also provide a gradio app, which uses <a href="https://wangrc.site/MoGePage/">MoGe</a> and <a href="https://github.com/danielgatis/rembg">Rembg</a> to generate 3D from a single image.

Launch it with:
```bash
python app.py
```

# 🎛️ Parameter Guide

Certain applications may benefit from different hyperparameters; here we provide guidance on how to select them.

#### 🔁 View Sampling

We sample novel views evenly on a cylindrical equal-area projection of the sphere.
Customize the sampling in <a href="eval_wrapper/sample_poses.py">sample_poses.py</a>. Use `--n_pred_views` to reduce the total number of views, which makes inference faster and reduces overlap and artifacts.
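For intuition, sampling evenly on a cylindrical equal-area projection means placing a grid that is uniform in azimuth and in the z-coordinate (height on the cylinder), which covers the sphere with approximately equal area per view. The sketch below only illustrates that idea; it is not the code in `sample_poses.py`, which builds full look-at poses and uses a slightly different grid (e.g. 5 views per axis gives 22 views in total).

```python
# Illustrative cylindrical equal-area sampling of viewing directions.
# Not the repository's implementation; see eval_wrapper/sample_poses.py for that.
import numpy as np

def sample_view_directions(n_per_axis: int) -> np.ndarray:
    """Return unit vectors on a grid that is uniform in (azimuth, z)."""
    azimuth = np.linspace(0.0, 2.0 * np.pi, n_per_axis, endpoint=False)
    z = np.linspace(-1.0, 1.0, n_per_axis + 2)[1:-1]   # skip the exact poles
    az, zz = np.meshgrid(azimuth, z, indexing="ij")
    r = np.sqrt(1.0 - zz**2)                            # radius of the latitude circle
    dirs = np.stack([r * np.cos(az), r * np.sin(az), zz], axis=-1)
    return dirs.reshape(-1, 3)                          # n_per_axis**2 unit directions
```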

#### 🟢 Confidence Threshold

You can set the confidence threshold with the `--set_conf` flag. As shown in the paper, a higher threshold generally improves accuracy and reduces edge bleeding, but it also reduces completeness.

#### 🧼 RaySt3R Masks

On top of what was presented in the paper, we also provide the option to consider all predicted masks for each point: a point is removed if any of the predicted masks classifies it as background.
In our limited testing this led to cleaner predictions, but it occasionally carves out crucial parts of the geometry.
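To make the rule concrete, here is a rough sketch of the check (illustrative only; the names and data layout below are made up, not the repository's code): a point survives only if every predicted view that sees it labels it as foreground.

```python
# Illustrative version of the --filter_all_masks rule; not the repository code.
import numpy as np

def filter_points_by_all_masks(points, views):
    """points: (N,3) world coords; views: list of (K, world2cam, mask) with mask (H,W) bool."""
    keep = np.ones(len(points), dtype=bool)
    for K, world2cam, mask in views:
        # Project the points into this predicted view.
        p_cam = (world2cam[:3, :3] @ points.T + world2cam[:3, 3:4]).T
        in_front = p_cam[:, 2] > 1e-6
        uv = (K @ (p_cam / np.clip(p_cam[:, 2:3], 1e-6, None)).T).T[:, :2]
        u = np.round(uv[:, 0]).astype(int)
        v = np.round(uv[:, 1]).astype(int)
        h, w = mask.shape
        visible = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        # Reject any point that lands on background in this view's mask.
        idx = np.nonzero(visible)[0]
        keep[idx] &= mask[v[idx], u[idx]]
    return points[keep]
```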

# 🏋️ Training

The RaySt3R training command is provided in <a href="xps/train_rayst3r.py">train_rayst3r.py</a>; documentation will follow later.