Data Preparation
The Omni3D dataset comprises 6 datasets that have been pre-processed into a unified annotation format and camera coordinate system. To use a subset or the full dataset, you must download:
- The processed Omni3D json files
- RGB images from each dataset separately
Download Omni3D json
Run
sh datasets/Omni3D/download_omni3d_json.sh
to download and extract the Omni3D train, val and test json annotation files.
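As a quick sanity check (a sketch only; it assumes the script extracts the json files into datasets/Omni3D/, so adjust the glob if yours land elsewhere), you can parse each file and print its image and annotation counts:

import glob
import json

# Hypothetical extraction directory; adjust if the script unpacks elsewhere.
for path in sorted(glob.glob('datasets/Omni3D/*.json')):
    with open(path) as f:
        data = json.load(f)
    print(path, len(data['images']), 'images,', len(data['annotations']), 'annotations')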
Download Individual Datasets
Below are the instructions for setting up each individual dataset. It is recommended to download only the data you plan to use.
KITTI
Download the left color images from KITTI's official website. Unzip or softlink the images into the root ./Omni3D/
which should have the folder structure as detailed below. Note that we only require the image_2 folder.
datasets/KITTI_object
└── training
    └── image_2
nuScenes
Download the trainval images from the official nuScenes website. Unzip or softlink the images into the root ./Omni3D/
which should have the folder structure as detailed below. Note that we only require the CAM_FRONT folder.
datasets/nuScenes
└── samples
    └── CAM_FRONT
Objectron
Run
sh datasets/objectron/download_objectron_images.sh
to download and extract the Objectron pre-processed images (~24 GB).
SUN RGB-D
Download the "SUNRGBD V1" images from SUN RGB-D's official website. Unzip or softlink the images into the root ./Omni3D/
which should have the folder structure as detailed below.
datasets/SUNRGBD
├── kv1
├── kv2
└── realsense
ARKitScenes
Run
sh datasets/ARKitScenes/download_arkitscenes_images.sh
to download and extract the ARKitScenes pre-processed images (~28 GB).
Hypersim
Follow the download instructions from Thomas Germer to download only the *tonemap.jpg preview images, which avoids downloading the full Hypersim dataset. For example:
git clone https://github.com/apple/ml-hypersim
cd ml-hypersim/
python contrib/99991/download.py -c .tonemap.jpg -d /path/to/Omni3D/datasets/hypersim --silent
Then arrange or unzip the downloaded images into the root ./Omni3D/
so that it has the below folder structure.
datasets/hypersim/
├── ai_001_001
├── ai_001_002
├── ai_001_003
├── ai_001_004
├── ai_001_005
├── ai_001_006
...
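If you set up several datasets, a short check like the one below can confirm everything landed in the expected place relative to the repo root. The Objectron and ARKitScenes entries are assumptions, since their trees are created by their download scripts:

import os

# Expected image roots; the objectron and ARKitScenes entries are
# assumptions, since their layout is created by the download scripts.
expected = [
    'datasets/KITTI_object/training/image_2',
    'datasets/nuScenes/samples/CAM_FRONT',
    'datasets/objectron',
    'datasets/SUNRGBD/kv1',
    'datasets/ARKitScenes',
    'datasets/hypersim/ai_001_001',
]

for path in expected:
    print(('ok      ' if os.path.isdir(path) else 'MISSING ') + path)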
Data Usage
Below we describe the unified 3D annotation coordinate systems, annotation format, and an example script.
Coordinate System
All 3D annotations are provided in a shared camera coordinate system with +x right, +y down, +z toward screen.
The vertex order of bbox3D_cam:
                v4_____________________v5
                /|                    /|
               / |                   / |
              /  |                  /  |
             /___|_________________/   |
          v0|    |                 |v1 |
            |    |                 |   |
            |    |                 |   |
            |    |                 |   |
            |    |_________________|___|
            |   / v7               |   /v6
            |  /                   |  /
            | /                    | /
            |/_____________________|/
            v3                     v2
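To make the convention concrete, below is a minimal sketch that reconstructs the 8 bbox3D_cam corners in the vertex order above from center_cam, dimensions, and R_cam (fields defined in the next section). It assumes width, height, and length lie along the object's local x, y, and z axes; verify against the released annotations before relying on it.

import numpy as np

def corners_from_box(center_cam, dimensions, R_cam):
    # Sketch only: assumes [width, height, length] lie along the
    # object's local x (right), y (down), z (forward) axes.
    w, h, l = dimensions
    # Signs follow the v0..v7 order in the diagram above:
    # v0-v3 front face (z = -l/2), v4-v7 back face (z = +l/2).
    x = np.array([-1,  1,  1, -1, -1,  1,  1, -1]) * w / 2
    y = np.array([-1, -1,  1,  1, -1, -1,  1,  1]) * h / 2
    z = np.array([-1, -1, -1, -1,  1,  1,  1,  1]) * l / 2
    corners = np.stack([x, y, z], axis=1)  # (8, 3) in the object frame
    return corners @ np.asarray(R_cam).T + np.asarray(center_cam)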
Annotation Format
Each dataset is stored as a Python dict in the format below.
dataset {
    "info"             : info,
    "images"           : [image],
    "categories"       : [category],
    "annotations"      : [object],
}

info {
    "id"               : str,
    "source"           : int,
    "name"             : str,
    "split"            : str,
    "version"          : str,
    "url"              : str,
}

image {
    "id"               : int,
    "dataset_id"       : int,
    "width"            : int,
    "height"           : int,
    "file_path"        : str,
    "K"                : list (3x3),
    "src_90_rotate"    : int,                   # image was rotated X times, 90 deg counterclockwise
    "src_flagged"      : bool,                  # flagged as potentially inconsistent sky direction
}

category {
    "id"               : int,
    "name"             : str,
    "supercategory"    : str,
}

object {
    "id"               : int,                   # unique annotation identifier
    "image_id"         : int,                   # identifier for the image
    "category_id"      : int,                   # identifier for the category
    "category_name"    : str,                   # plain name for the category

    # General 2D/3D box parameters.
    # Values are set to -1 when unavailable.
    "valid3D"          : bool,                  # false when there is no reliable 3D box
    "bbox2D_tight"     : [x1, y1, x2, y2],      # 2D corners of the annotated tight box
    "bbox2D_proj"      : [x1, y1, x2, y2],      # 2D corners projected from bbox3D
    "bbox2D_trunc"     : [x1, y1, x2, y2],      # 2D corners projected from bbox3D, then truncated
    "bbox3D_cam"       : [[x1, y1, z1]...[x8, y8, z8]],  # 3D corners in meters, camera coordinates
    "center_cam"       : [x, y, z],             # 3D center in meters, camera coordinates
    "dimensions"       : [width, height, length],  # object dimensions in meters
    "R_cam"            : list (3x3),            # rotation matrix from object to camera frame

    # Optional dataset-specific properties,
    # used mainly for evaluation and ignore regions.
    # Values are set to -1 when unavailable.
    "behind_camera"    : bool,                  # whether any corner is behind the camera
    "visibility"       : float,                 # annotated visibility, 0 to 1
    "truncation"       : float,                 # computed truncation, 0 to 1
    "segmentation_pts" : int,                   # visible instance segmentation points
    "lidar_pts"        : int,                   # visible LiDAR points on the object
    "depth_error"      : float,                 # L1 error between the depth map and the rendered object
}
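For intuition on how the 2D and 3D fields relate, the sketch below projects bbox3D_cam with the camera intrinsics K and takes the image-plane extents, which is roughly how bbox2D_proj is derived. It assumes all 8 corners are in front of the camera; see the behind_camera and truncation fields for the edge cases.

import numpy as np

def project_bbox3D(bbox3D_cam, K):
    # Project the 8 camera-space corners with intrinsics K,
    # then take the min/max extents in the image plane.
    pts = np.asarray(bbox3D_cam) @ np.asarray(K).T   # (8, 3)
    uv = pts[:, :2] / pts[:, 2:3]                    # perspective divide
    x1, y1 = uv.min(axis=0)
    x2, y2 = uv.max(axis=0)
    return [x1, y1, x2, y2]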
Example Loading Data
Each dataset is named "Omni3D_{name}_{split}.json", where split can be train, val, or test.
The annotations are in a COCO-like format: if you load the json with the Omni3D class, which inherits from the COCO class, you can use the basic COCO dataset functions, as demonstrated in the code below.
from cubercnn import data
dataset_paths_to_json = ['path/to/Omni3D/{name}_{split}.json', ...]
# Example 1. load all images
dataset = data.Omni3D(dataset_paths_to_json)
imgIds = dataset.getImgIds()
imgs = dataset.loadImgs(imgIds)
# Example 2. load annotations for image index 0
annIds = dataset.getAnnIds(imgIds=imgs[0]['id'])
anns = dataset.loadAnns(annIds)
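Since the returned annotations are plain dicts in the format described above, you can filter them directly; for example, using the valid3D flag:

# Example 3. keep only annotations with a reliable 3D box
anns_3d = [ann for ann in anns if ann['valid3D']]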