Preprocess Scripts
Please download all datasets from their original sources, except for vKITTI, for which we provide a fully processed version, so there is no need to download the original dataset. For MapFree and DL3DV, we also release depth maps computed using COLMAP Multi-View Stereo (MVS). See the sections below for details on how each dataset is processed, and please ensure compliance with the respective licensing agreements when downloading. In total, the data takes about 25 TB of disk space.
If you run into problems with any of the scripts, please feel free to open an issue.
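Given the roughly 25 TB footprint, it can help to verify free space on the target drive before downloading. A minimal sketch (the path is a placeholder for your own output directory):

```python
import shutil

def free_tb(path: str) -> float:
    """Free disk space at `path` in terabytes (10^12 bytes)."""
    return shutil.disk_usage(path).free / 1e12

# Check the drive that will hold the processed data; replace "." with
# your actual output directory, e.g. /path/to/your/outdir.
print(f"{free_tb('.'):.1f} TB free; ~25 TB needed for all datasets")
```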
- ARKitScenes
- BlendedMVS
- CO3Dv2
- MegaDepth
- ScanNet++
- ScanNet
- Waymo Open Dataset
- WildRGB-D
- Map-free
- TartanAir
- UnrealStereo4K
- Virtual KITTI 2
- 3D Ken Burns
- BEDLAM
- COP3D
- DL3DV
- Dynamic Replica
- EDEN
- Hypersim
- IRS
- Matterport3D
- MVImgNet
- MVS-Synth
- OmniObject3D
- PointOdyssey
- RealEstate10K
- SmartPortraits
- Spring
- Synscapes
- UASOL
- UrbanSyn
- HOI4D
ARKitScenes
First download the pre-computed pairs provided by DUSt3R.
Then run the following commands:
python preprocess_arkitscenes.py --arkitscenes_dir /path/to/your/raw/data --precomputed_pairs /path/to/your/pairs --output_dir /path/to/your/outdir
python generate_set_arkitscenes.py --root /path/to/your/outdir --splits Training Test --max_interval 5.0 --num_workers 8
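The `--max_interval` flag suggests that view pairs are kept only when their capture times are close enough. An illustrative sketch of that kind of filtering, not the script's actual code (the function name is hypothetical):

```python
# Keep ordered frame pairs (i, j) whose timestamps differ by at most
# `max_interval` seconds, as the --max_interval flag suggests.
from itertools import combinations

def pairs_within_interval(timestamps, max_interval):
    """Return index pairs (i, j), i < j, with |t_j - t_i| <= max_interval."""
    return [(i, j)
            for (i, ti), (j, tj) in combinations(enumerate(timestamps), 2)
            if abs(tj - ti) <= max_interval]

# Frames at 0, 2, 4, and 10 seconds with a 5-second window:
print(pairs_within_interval([0.0, 2.0, 4.0, 10.0], 5.0))
# -> [(0, 1), (0, 2), (1, 2)]
```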
ARKitScenes_highres
This dataset is a subset of ARKitScenes with high-resolution depth maps.
python preprocess_arkitscenes_highres.py --arkitscenes_dir /path/to/your/raw/data --output_dir /path/to/your/outdir
BlendedMVS
Follow DUSt3R to generate the processed BlendedMVS data:
python preprocess_blendedmvs.py --blendedmvs_dir /path/to/your/raw/data --precomputed_pairs /path/to/your/pairs --output_dir /path/to/your/outdir
Then put our overlap set under /path/to/your/outdir.
CO3D
Follow DUSt3R to generate the processed CO3D data.
python3 preprocess_co3d.py --co3d_dir /path/to/your/raw/data --output_dir /path/to/your/outdir
MegaDepth
First download our precomputed set and put it under /path/to/your/outdir.
Then run:
python preprocess_megadepth.py --megadepth_dir /path/to/your/raw/data --precomputed_sets /path/to/precomputed_sets --output_dir /path/to/your/outdir
ScanNet
python preprocess_scannet.py --scannet_dir /path/to/your/raw/data --output_dir /path/to/your/outdir
python generate_set_scannet.py --root /path/to/your/outdir \
--splits scans_test scans_train --max_interval 150 --num_workers 8
ScanNet++
First download the pre-computed pairs provided by DUSt3R.
Then run the following commands:
python preprocess_scannetpp.py --scannetpp_dir /path/to/your/raw/data --precomputed_pairs /path/to/your/pairs --output_dir /path/to/your/outdir
python generate_set_scannetpp.py --root /path/to/your/outdir \
--max_interval 150 --num_workers 8
Waymo
Follow DUSt3R to generate the processed Waymo data.
python3 preprocess_waymo.py --waymo_dir /path/to/your/raw/data --precomputed_pairs /path/to/precomputed_pairs --output_dir /path/to/your/outdir
Then download our invalid_files and put it under /path/to/your/outdir.
WildRGBD
Follow DUSt3R to generate the processed WildRGBD data.
python3 preprocess_wildrgbd.py --wildrgbd_dir /path/to/your/raw/data --output_dir /path/to/your/outdir
Mapfree
First preprocess the COLMAP results provided by Mapfree:
python3 preprocess_mapfree.py --mapfree_dir /path/to/train/data --colmap_dir /path/to/colmap/data --output_dir /path/to/first/outdir
Then re-organize the data structure:
python3 preprocess_mapfree2.py --mapfree_dir /path/to/first/outdir --output_dir /path/to/final/outdir
Finally, download our released depths and masks and combine them with your /path/to/final/outdir:
rsync -av --update /path/to/our/release /path/to/final/outdir
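With `--update`, rsync skips any file that already exists at the destination with an equal or newer modification time, so your processed data is not overwritten by the release. A pure-Python sketch of that merge behavior (illustrative only; use rsync in practice):

```python
# Sketch of `rsync -av --update src dst` at the file level: copy each file
# unless the destination already has a copy that is at least as new.
import os
import shutil

def merge_update(src: str, dst: str) -> None:
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        out_dir = os.path.join(dst, rel)
        os.makedirs(out_dir, exist_ok=True)
        for name in files:
            s = os.path.join(root, name)
            d = os.path.join(out_dir, name)
            # --update: skip if the destination copy is at least as new
            if os.path.exists(d) and os.path.getmtime(d) >= os.path.getmtime(s):
                continue
            shutil.copy2(s, d)  # copy2 preserves timestamps, like rsync -a
```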
TartanAir
python3 preprocess_tartanair.py --tartanair_dir /path/to/your/raw/data --output_dir /path/to/your/outdir
UnrealStereo4K
python3 preprocess_unreal4k.py --unreal4k_dir /path/to/your/raw/data --output_dir /path/to/your/outdir
Virtual KITTI 2
Since Virtual KITTI 2 is released under the CC BY-NC-SA 3.0 license, we directly release our preprocessed data.
3D Ken Burns
python preprocess_3dkb.py --root /path/to/data_3d_ken_burns \
--out_dir /path/to/processed_3dkb \
[--num_workers 4] [--seed 42]
BEDLAM
python preprocess_bedlam.py --root /path/to/extracted_data \
--outdir /path/to/processed_bedlam \
[--num_workers 4]
COP3D
python3 preprocess_cop3d.py --cop3d_dir /path/to/cop3d \
--output_dir /path/to/processed_cop3d
DL3DV
Due to potential licensing issues, you may need to run multi-view stereo on DL3DV yourself (which is extremely time-consuming). Once that is done, you can use our preprocessing script:
```
python3 preprocess_dl3dv.py --dl3dv_dir /path/to/dl3dv \
--output_dir /path/to/processed_dl3dv
```
Update: We've released the full version of our processed DL3DV dataset!
To use our processed DL3DV data, please ensure that you first cite the original DL3DV work and adhere to their licensing terms.
You can then download the released components.
After downloading, merge the components using the provided script:
python3 merge_dl3dv.py # remember to change necessary paths
Dynamic Replica
python preprocess_dynamic_replica.py --root_dir /path/to/data_dynamic_replica \
--out_dir /path/to/processed_dynamic_replica
EDEN
python preprocess_eden.py --root /path/to/data_raw_videos/data_eden \
--out_dir /path/to/data_raw_videos/processed_eden \
[--num_workers N]
Hypersim
python preprocess_hypersim.py --hypersim_dir /path/to/hypersim \
--output_dir /path/to/processed_hypersim
IRS
python preprocess_irs.py --root_dir /path/to/data_irs \
                         --out_dir /path/to/processed_irs
Matterport3D
python preprocess_mp3d.py --root_dir /path/to/data_mp3d/v1/scans \
--out_dir /path/to/processed_mp3d
MVImgNet
python preprocess_mvimgnet.py --data_dir /path/to/MVImgNet_data \
--pcd_dir /path/to/MVPNet \
--output_dir /path/to/processed_mvimgnet
MVS-Synth
python preprocess_mvs_synth.py --root_dir /path/to/data_mvs_synth/GTAV_720/ \
--out_dir /path/to/processed_mvs_synth \
--num_workers 32
OmniObject3D
python preprocess_omniobject3d.py --input_dir /path/to/input_root --output_dir /path/to/output_root
PointOdyssey
python preprocess_point_odyssey.py --input_dir /path/to/input_dataset --output_dir /path/to/output_dataset
RealEstate10K
python preprocess_re10k.py --root_dir /path/to/train \
--info_dir /path/to/RealEstate10K/train \
--out_dir /path/to/processed_re10k
SmartPortraits
You need to follow the official processing pipeline first, replacing convert_to_TUM/utils/convert_to_tum.py
with our datasets_preprocess/custom_convert2TUM.py
(you may need to change the input and output paths).
Then run:
python preprocess_smartportraits.py \
--input_dir /path/to/official/pipeline/output \
--output_dir /path/to/processed_smartportraits
Spring
python preprocess_spring.py \
--root_dir /path/to/spring/train \
--out_dir /path/to/processed_spring \
--baseline 0.065 \
--output_size 960 540
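The `--baseline 0.065` flag points at the standard stereo relation depth = focal_px * baseline_m / disparity_px, which converts Spring's disparity maps to metric depth. A minimal sketch of that relation (assumed; the actual script may differ in details, and the function name is illustrative):

```python
# Convert a stereo disparity map (pixels) to metric depth (meters) via
# depth = focal_px * baseline_m / disparity_px; zero disparity maps to inf.
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m=0.065):
    disparity = np.asarray(disparity, dtype=np.float64)
    with np.errstate(divide="ignore"):
        return np.where(disparity > 0,
                        focal_px * baseline_m / disparity,
                        np.inf)

# 1 px of disparity at a 1000 px focal length and 6.5 cm baseline -> 65 m.
print(disparity_to_depth([[1.0, 2.0]], focal_px=1000.0))
# -> [[65.  32.5]]
```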
Synscapes
python preprocess_synscapes.py \
--synscapes_dir /path/to/Synscapes/Synscapes \
--output_dir /path/to/processed_synscapes
UASOL
python preprocess_uasol.py \
--input_dir /path/to/data_uasol \
--output_dir /path/to/processed_uasol
UrbanSyn
python preprocess_urbansyn.py \
--input_dir /path/to/data_urbansyn \
--output_dir /path/to/processed_urbansyn
HOI4D
python preprocess_hoi4d.py \
--root_dir /path/to/HOI4D_release \
--cam_root /path/to/camera_params \
--out_dir /path/to/processed_hoi4d