- [Data Preparation](#data-preparation)
  - [Download Omni3D json](#download-omni3d-json)
  - [Download Individual Datasets](#download-individual-datasets)
- [Data Usage](#data-usage)
  - [Coordinate System](#coordinate-system)
  - [Annotation Format](#annotation-format)
  - [Example Loading Data](#example-loading-data)

# Data Preparation

The Omni3D dataset comprises six datasets which have been pre-processed into the same annotation format and camera coordinate system. To use a subset or the full dataset you must download:

1. The processed Omni3D json files
2. RGB images from each dataset separately

## Download Omni3D json

Run

```
sh datasets/Omni3D/download_omni3d_json.sh
```

to download and extract the Omni3D train, val and test json annotation files.

## Download Individual Datasets

Below are the instructions for setting up each individual dataset. We recommend downloading only the data you plan to use.

### KITTI

Download the left color images from [KITTI's official website](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d). Unzip or softlink the images into the root `./Omni3D/` so that it has the folder structure detailed below. Note that we only require the `image_2` folder.

```bash
datasets/KITTI_object
└── training
    ├── image_2
```

### nuScenes

Download the trainval images from the [official nuScenes website](https://www.nuscenes.org/nuscenes#download). Unzip or softlink the images into the root `./Omni3D/` so that it has the folder structure detailed below. Note that we only require the `CAM_FRONT` folder.

```bash
datasets/nuScenes
└── samples
    ├── CAM_FRONT
```

### Objectron

Run

```
sh datasets/objectron/download_objectron_images.sh
```

to download and extract the Objectron pre-processed images (~24 GB).

### SUN RGB-D

Download the "SUNRGBD V1" images from [SUN RGB-D's official website](https://rgbd.cs.princeton.edu/). Unzip or softlink the images into the root `./Omni3D/` so that it has the folder structure detailed below.

```bash
./Omni3D/datasets/SUNRGBD
├── kv1
├── kv2
├── realsense
```

### ARKitScenes

Run

```
sh datasets/ARKitScenes/download_arkitscenes_images.sh
```

to download and extract the ARKitScenes pre-processed images (~28 GB).

### Hypersim

Follow the [download instructions](https://github.com/apple/ml-hypersim/tree/main/contrib/99991) from [Thomas Germer](https://github.com/99991) to download only the \*tonemap.jpg preview images, which avoids downloading the full Hypersim dataset. For example:

```bash
git clone https://github.com/apple/ml-hypersim
cd ml-hypersim/
python contrib/99991/download.py -c .tonemap.jpg -d /path/to/Omni3D/datasets/hypersim --silent
```

Then arrange or unzip the downloaded images into the root `./Omni3D/` so that it has the folder structure below.

```bash
datasets/hypersim/
├── ai_001_001
├── ai_001_002
├── ai_001_003
├── ai_001_004
├── ai_001_005
├── ai_001_006
...
```

# Data Usage

Below we describe the unified 3D annotation coordinate system, the annotation format, and an example of loading the data.

## Coordinate System

All 3D annotations are provided in a shared camera coordinate system with +x right, +y down, and +z toward the screen. The vertex order of `bbox3D_cam` is:

```
              v4_____________________v5
              /|                    /|
             / |                   / |
            /  |                  /  |
           /___|_________________/   |
         v0|   |                 |v1 |
           |   |                 |   |
           |   |                 |   |
           |   |                 |   |
           |   |_________________|___|
           |   / v7              |   /v6
           |  /                  |  /
           | /                   | /
           |/____________________|/
           v3                    v2
```
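To make these conventions concrete, here is a minimal sketch, not part of the released code, that rebuilds the eight corners from an annotation's `center_cam`, `dimensions`, and `R_cam` and projects them with `K`. The helper names are hypothetical, and it assumes `width`, `height`, and `length` lie along the object-frame x, y, and z axes, matching the diagram above:

```python
import numpy as np

def box_corners_cam(center_cam, dimensions, R_cam):
    """Rebuild the 8 corners (v0..v7, ordered as in the diagram above) in
    camera coordinates. Assumes width/height/length lie along the
    object-frame x/y/z axes."""
    w, h, l = dimensions
    # signs of the v0..v7 offsets in the object frame, read off the diagram
    x = np.array([-1,  1,  1, -1, -1,  1,  1, -1]) * w / 2.0
    y = np.array([-1, -1,  1,  1, -1, -1,  1,  1]) * h / 2.0
    z = np.array([-1, -1, -1, -1,  1,  1,  1,  1]) * l / 2.0
    corners = np.stack([x, y, z], axis=1)  # (8, 3) in the object frame
    # rotate into the camera frame, then translate to the box center
    return corners @ np.asarray(R_cam).T + np.asarray(center_cam)

def project_to_image(points_cam, K):
    """Project camera-frame points (N, 3) to pixel coordinates (N, 2)."""
    uvw = points_cam @ np.asarray(K).T
    return uvw[:, :2] / uvw[:, 2:3]
```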
## Annotation Format

Each dataset is formatted as a Python dict with the following structure.

```python
dataset {
    "info"              : info,
    "images"            : [image],
    "categories"        : [category],
    "annotations"       : [object],
}

info {
    "id"                : str,
    "source"            : int,
    "name"              : str,
    "split"             : str,
    "version"           : str,
    "url"               : str,
}

image {
    "id"                : int,
    "dataset_id"        : int,
    "width"             : int,
    "height"            : int,
    "file_path"         : str,
    "K"                 : list (3x3),
    "src_90_rotate"     : int,    # image was rotated X times, 90 deg counterclockwise
    "src_flagged"       : bool,   # flagged as potentially inconsistent sky direction
}

category {
    "id"                : int,
    "name"              : str,
    "supercategory"     : str,
}

object {
    "id"                : int,    # unique annotation identifier
    "image_id"          : int,    # identifier for the image
    "category_id"       : int,    # identifier for the category
    "category_name"     : str,    # plain name for the category

    # General 2D/3D box parameters.
    # Values are set to -1 when unavailable.
    "valid3D"           : bool,   # False when there is no reliable 3D box
    "bbox2D_tight"      : [x1, y1, x2, y2],   # 2D corners of the annotated tight box
    "bbox2D_proj"       : [x1, y1, x2, y2],   # 2D corners projected from bbox3D
    "bbox2D_trunc"      : [x1, y1, x2, y2],   # 2D corners projected from bbox3D, then truncated
    "bbox3D_cam"        : [[x1, y1, z1]...[x8, y8, z8]],  # 3D corners in meters, camera coordinates
    "center_cam"        : [x, y, z],          # 3D center in meters, camera coordinates
    "dimensions"        : [width, height, length],  # object dimensions in meters
    "R_cam"             : list (3x3),         # rotation matrix from the object frame to the camera frame

    # Optional dataset-specific properties,
    # used mainly for evaluation and ignore handling.
    # Values are set to -1 when unavailable.
    "behind_camera"     : bool,   # a corner is behind the camera
    "visibility"        : float,  # annotated visibility, 0 to 1
    "truncation"        : float,  # computed truncation, 0 to 1
    "segmentation_pts"  : int,    # number of visible instance segmentation points
    "lidar_pts"         : int,    # number of visible LiDAR points in the object
    "depth_error"       : float,  # L1 error between the depth map and the rendered object
}
```

## Example Loading Data

Each dataset is named "Omni3D_{name}_{split}.json", where split can be train, val, or test. The annotations are in a COCO-like format: if you load the json via the `Omni3D` class, which inherits from the [COCO class](https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocotools/coco.py#L70), you can use the basic COCO dataset functions, as demonstrated in the code below.

```python
from cubercnn import data

dataset_paths_to_json = ['path/to/Omni3D/{name}_{split}.json', ...]

# Example 1. load all images
dataset = data.Omni3D(dataset_paths_to_json)
imgIds = dataset.getImgIds()
imgs = dataset.loadImgs(imgIds)

# Example 2. load annotations for image index 0
annIds = dataset.getAnnIds(imgIds=imgs[0]['id'])
anns = dataset.loadAnns(annIds)
```
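Continuing the example above, the following sketch shows one way the per-annotation fields might be consumed, reusing `imgs` and `anns` from the previous block. The visibility cutoff is purely illustrative, and keep in mind that optional fields are -1 when unavailable:

```python
import numpy as np

K = np.array(imgs[0]['K'])                 # (3, 3) camera intrinsics

for ann in anns:
    # skip annotations without a reliable 3D box; the 0.5 visibility
    # threshold is illustrative (visibility is -1 when unavailable)
    if not ann['valid3D'] or ann['visibility'] < 0.5:
        continue
    corners = np.array(ann['bbox3D_cam'])  # (8, 3) corners, camera frame
    uvw = corners @ K.T                    # project with the intrinsics
    uv = uvw[:, :2] / uvw[:, 2:3]          # (8, 2) pixel coordinates
    print(ann['category_name'], uv.round(1))
```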