The dataset consists of images of the 21 YCB-V objects, captured with a 2K-resolution Basler camera mounted on a KUKA IIWA 14 robotic manipulator. The camera poses are computed from the kinematic chain of the robot, the camera is calibrated via hand-eye calibration using the OpenCV routine, and the images are undistorted with COLMAP.
The dataset contains one folder per object, holding the undistorted images, object masks, camera calibration, and camera poses in multiple formats compatible with NerfStudio and COLMAP.
The data is organized as follows:
├── Object
│   ├── transforms_3.json
│   ├── transforms_5.json
│   ├── transforms_10.json
│   ├── transforms_15.json
│   ├── transforms_20.json
│   ├── transforms_25.json
│   ├── transforms_50.json
│   ├── transforms_75.json
│   ├── transforms_100.json
│   ├── transforms_150.json
│   ├── transforms_300.json
│   ├── transforms_all.json
│   ├── undistorted_images
│   │   ├── img_0001.png
│   │   ├── img_0002.png
│   │   ├── img_0003.png
│   │   ├── img_0004.png
│   │   └── ...
│   ├── masks
│   │   ├── img_0001.png
│   │   ├── img_0002.png
│   │   ├── img_0003.png
│   │   ├── img_0004.png
│   │   └── ...
│   └── colmap
│       ├── points3D.txt
│       ├── images.txt
│       └── cameras.txt
...
transforms_N.json: contains the camera poses and intrinsics for a subset of N images in the format expected by the NerfStudio dataparser. The poses have already been pre-processed to fit inside a unit cube to facilitate the NeRF-based reconstruction. The subsets cover N = 3, 5, 10, 15, 20, 25, 50, 75, 100, 150, 300, and all images. The poses can be converted into meter scale by multiplying them with the real_world_scale parameter in the JSON file (see the sketch after this list). All camera poses are registered to the coordinate frame of the object defined by the BOP release of the YCB-V meshes.
colmap: contains an empty 3D model with the camera calibration and the image poses written using the COLMAP convention.
undistorted_images: contains the undistorted images.
masks: contains the object masks.
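For example, metric camera poses can be read from a transforms file as follows. This is a minimal sketch: the frames / transform_matrix / file_path keys are the standard NerfStudio format, the object path is illustrative, and we assume the scale applies to the translation component of each pose.

import json
import numpy as np

# Load a NerfStudio-style transforms file (path is illustrative).
with open("data/19_large_clamp/transforms_all.json") as f:
    meta = json.load(f)

scale = meta["real_world_scale"]  # unit-cube scale -> meters (provided by this dataset)

for frame in meta["frames"]:
    c2w = np.array(frame["transform_matrix"])  # 4x4 camera-to-world pose, NerfStudio convention
    c2w[:3, 3] *= scale                        # assumed: rescale the translation to meters
    print(frame["file_path"], c2w[:3, 3])      # camera center in the BOP object frame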
The mesh dataset consists of the meshes of the 21 YCB-V objects reconstructed from the above-mentioned calibrated images using the following methods. The codebase used for each method is given in parentheses.
1. BakedSDF (SDFStudio)
2. RealityCapture (Native)
3. COLMAP (Native)
4. MonoSDF (SDFStudio)
5. Nerfacto (NerfStudio)
6. Neuralangelo (Native)
7. NeuS (SDFStudio)
8. Instant-NGP (Native)
9. Plenoxels (Native)
10. VolSDF (SDFStudio)
11. UniSurf (SDFStudio)
For a given method, each object mesh is contained in one folder, in either .obj or .ply format. The folder also holds a texture file whenever the reconstruction method produced one; otherwise the mesh is only vertex-colored.
As with the camera poses, the meshes are already registered to the coordinate frame of the object defined by the BOP release of the YCB-V meshes.
├── Method_subset
│   ├── 01_master_chef_can
│   │   ├── mesh_0.png
│   │   ├── mesh.mtl
│   │   └── mesh.obj
│   ├── 02_cracker_box
│   │   ├── mesh_0.png
│   │   ├── mesh.mtl
│   │   └── mesh.obj
│   ├── 04_tomatoe_soup_can
│   │   ├── mesh_0.png
│   │   ├── mesh.mtl
│   │   └── mesh.obj
│   ├── 05_mustard_bottle
│   ...
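Since the reconstructed meshes and the BOP reference models share the same object frame, they can be compared directly without any alignment step. A minimal sketch using trimesh (a library choice of ours, not part of this release; both paths are illustrative, and we assume the BOP YCB-V models' millimeter scale):

import trimesh

# Load a reconstructed mesh and the corresponding BOP YCB-V reference model.
rec = trimesh.load("nerfacto_all/01_master_chef_can/mesh.obj", force="mesh")
ref = trimesh.load("bop/ycbv/models/obj_000001.ply", force="mesh")

# Both meshes live in the BOP object frame, so distances are meaningful as-is.
# One-sided distance from a sample of reconstructed vertices to the reference surface.
closest, dist, _ = trimesh.proximity.closest_point(ref, rec.vertices[:2000])
print(f"mean vertex-to-surface distance: {dist.mean():.3f}")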
The image dataset can be downloaded from the data page with:
# e.g. to download the 19_large_clamp object,
mkdir -p data/19_large_clamp
wget https://data.ciirc.cvut.cz/public/projects/2023BenchmarkPoseEstimationReconstructedMesh/Image_dataset/19_large_clamp.zip -P data/
unzip data/19_large_clamp.zip -d data/19_large_clamp
The mesh dataset can be downloaded from the data page with:
wget https://data.ciirc.cvut.cz/public/projects/2023BenchmarkPoseEstimationReconstructedMesh/reconstructed_meshes/<method>_<dataset_size>.zip
# e.g. to download meshes reconstructed by Nerfacto trained on all images
wget https://data.ciirc.cvut.cz/public/projects/2023BenchmarkPoseEstimationReconstructedMesh/reconstructed_meshes/nerfacto_all.zip
We provide scripts to convert the poses from the NerfStudio convention to the COLMAP and BOP-Benchmark ones:
python3 -m scripts.nerf_to_colmap --dataset_dir <path to object image folder>
python3 -m scripts.nerf_to_bop --dataset_dir <path to object image folder>
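At its core, the NerfStudio-to-COLMAP step is the standard OpenGL-to-OpenCV camera axis flip followed by an inversion. A minimal sketch of the idea (not the scripts' exact code):

import numpy as np

def nerfstudio_c2w_to_colmap_w2c(c2w: np.ndarray) -> np.ndarray:
    """Map a 4x4 camera-to-world pose from the NerfStudio (OpenGL: +y up, +z back)
    camera convention to a COLMAP (OpenCV: +y down, +z forward) world-to-camera pose."""
    c2w = c2w.copy()
    c2w[0:3, 1:3] *= -1.0      # flip the y and z camera axes (OpenGL -> OpenCV)
    return np.linalg.inv(c2w)  # COLMAP stores world-to-camera transforms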
The YCB objects are licensed under Creative Commons Attribution 4.0 International (CC BY 4.0).
This dataset is also released under Creative Commons Attribution 4.0 International (CC BY 4.0).
If you use any of the above data in a publication, please consider citing the following papers:
@inproceedings{calli2015ycb,
  title={The {YCB} object and model set: Towards common benchmarks for manipulation research},
  author={Calli, Berk and Singh, Arjun and Walsman, Aaron and Srinivasa, Siddhartha and Abbeel, Pieter and Dollar, Aaron M},
  booktitle={2015 International Conference on Advanced Robotics (ICAR)},
  pages={510--517},
  year={2015},
  organization={IEEE}
}
@article{calli2015benchmarking,
  title={Benchmarking in manipulation research: Using the Yale-CMU-Berkeley object and model set},
  author={Calli, Berk and Walsman, Aaron and Singh, Arjun and Srinivasa, Siddhartha and Abbeel, Pieter and Dollar, Aaron M},
  journal={IEEE Robotics \& Automation Magazine},
  volume={22},
  number={3},
  pages={36--52},
  year={2015},
  publisher={IEEE}
}
@inproceedings{xiang2018posecnn,
  title={{PoseCNN}: A Convolutional Neural Network for {6D} Object Pose Estimation in Cluttered Scenes},
  author={Xiang, Yu and Schmidt, Tanner and Narayanan, Venkatraman and Fox, Dieter},
  booktitle={Robotics: Science and Systems (RSS)},
  year={2018}
}