Dataset

Dataset collection

3D dataset collection

Objects are placed on a rotating table so that the camera can capture them from different angles. Alternatively, place the object on a normal table and change its orientation manually.

Note

This only works with a single object.

Setup:

Using an external camera
Launch the camera

Go to RealSense2 camera for more information about the camera.

Apply a static transform from camera_frame to base_link, as explained in RealSense2 camera.

Make sure the point cloud of the plane is parallel to the ground in rviz; transform or rotate it if necessary.

Note

The passthrough filter will not work if the plane is not parallel to the ground.
Launch multimodal object recognition
roslaunch mir_object_recognition multimodal_object_recognition.launch debug_mode:=true
Note

Dataset collection is only enabled in debug_mode. You can also specify a logdir in which to save the data, e.g. logdir:=/home/robocup/cloud_dataset.
Start collecting the dataset
rostopic pub /mir_perception/multimodal_object_recognition/event_in std_msgs/String e_start
Using the robot arm camera

Bring up the robot

Start multimodal_object_recognition and continue with the next steps as described previously.
Note

The segmented point clouds are saved in the logdir.
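The note about the passthrough filter can be illustrated with a small NumPy sketch. It is not the filter used by mir_object_recognition; the function names, the synthetic scene, and the limits are assumptions chosen for illustration. A passthrough filter keeps points whose coordinate on one axis lies within fixed limits, so a height slice only separates the object from the table when the table plane is aligned with that axis.

```python
import numpy as np


def passthrough_filter(points, axis=2, lower=0.0, upper=0.1):
    """Keep only the points whose coordinate on `axis` lies in [lower, upper]."""
    coords = points[:, axis]
    mask = (coords >= lower) & (coords <= upper)
    return points[mask]


def rotate_about_x(points, angle_rad):
    """Rotate an (N, 3) cloud about the x axis to emulate a tilted camera frame."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rotation = np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])
    return points @ rotation.T


# Synthetic scene (assumption): a flat table at z = 0 with a small object on it.
rng = np.random.default_rng(0)
table = np.column_stack([
    rng.uniform(-0.5, 0.5, 500),
    rng.uniform(-0.5, 0.5, 500),
    np.zeros(500),
])
obj = np.column_stack([
    rng.uniform(-0.02, 0.02, 100),
    rng.uniform(-0.02, 0.02, 100),
    rng.uniform(0.01, 0.08, 100),
])
scene = np.vstack([table, obj])

# Plane parallel to the ground: a slice just above z = 0 keeps only the object.
aligned = passthrough_filter(scene, axis=2, lower=0.005, upper=0.1)

# Plane tilted by 20 degrees: the same slice now also cuts through the table.
tilted_scene = rotate_about_x(scene, np.deg2rad(20.0))
tilted = passthrough_filter(tilted_scene, axis=2, lower=0.005, upper=0.1)

print(len(aligned), len(tilted))
```

With the aligned scene only the object points survive; with the tilted scene a band of table points passes the filter as well, which is why the transform should be corrected before collecting data.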

2D dataset collection

Images can be collected using the robot camera or an external camera. They can also be collected using the easy augment tool, which uses an Intel RealSense D435 camera to capture images and automatically annotate them for 2D object detection.

Dataset preprocessing

Before training the model, the data should be preprocessed. This includes, but is not limited to, removing bad data, normalizing it, and converting it to the required format, such as h5 for point clouds and VOC or KITTI for images.
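As a rough illustration of two of these steps for point clouds, the sketch below drops invalid points and normalizes the cloud into the unit sphere. This is one common convention and not necessarily what the project's own preprocessing does; both function names are hypothetical.

```python
import numpy as np


def drop_invalid_points(xyz):
    """Remove rows that contain NaN or infinite coordinates."""
    xyz = np.asarray(xyz, dtype=np.float64)
    return xyz[np.isfinite(xyz).all(axis=1)]


def normalize_point_cloud(xyz):
    """Centre an (N, 3) cloud at the origin and scale it into the unit sphere."""
    xyz = np.asarray(xyz, dtype=np.float64)
    centred = xyz - xyz.mean(axis=0)
    radius = np.linalg.norm(centred, axis=1).max()
    if radius == 0.0:
        # All points coincide, so there is nothing to scale.
        return centred
    return centred / radius


# Toy cloud with one invalid reading (illustrative values only).
cloud = np.array([
    [0.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [np.nan, 1.0, 1.0],
    [0.0, 2.0, 0.0],
])
clean = drop_invalid_points(cloud)
normalized = normalize_point_cloud(clean)
```

After normalization the centroid sits at the origin and the farthest point lies at distance 1, which removes differences in object position and scale between samples.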

3D dataset preprocessing

An example of the data directory structure:

b-it-bots_atwork_dataset
├── train
│   ├── AXIS
│   │   ├── axis_0001.pcd
│   │   ├── ...
│   ├── ...
├── test
│   ├── AXIS
│   │   ├── axis_0001.pcd
│   │   ├── ...
│   ├── ...
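Given a layout like the one above, the class label of each point cloud can be read from the name of its parent directory. The following is a minimal sketch using only the standard library; the helper name list_labelled_clouds is hypothetical, and the demo builds a throwaway directory with empty placeholder files rather than real .pcd data.

```python
import tempfile
from pathlib import Path


def list_labelled_clouds(dataset_root, split):
    """Return sorted (pcd_path, label) pairs for one split, e.g. 'train'."""
    split_dir = Path(dataset_root) / split
    samples = []
    for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        for pcd_path in sorted(class_dir.glob("*.pcd")):
            # The directory name (e.g. AXIS) serves as the class label.
            samples.append((pcd_path, class_dir.name))
    return samples


# Demo on a temporary directory that mirrors the structure shown above.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "b-it-bots_atwork_dataset"
    placeholder_files = [
        ("train", "axis_0001.pcd"),
        ("train", "axis_0002.pcd"),
        ("test", "axis_0001.pcd"),
    ]
    for split, file_name in placeholder_files:
        (root / split / "AXIS").mkdir(parents=True, exist_ok=True)
        (root / split / "AXIS" / file_name).touch()
    train_samples = list_labelled_clouds(root, "train")
    test_samples = list_labelled_clouds(root, "test")

print(len(train_samples), len(test_samples))
```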

The dataset preprocessing can be found in this notebook.

It generates pgz files containing a dictionary of objects, each consisting of x, y, z, r, g, b values and a label.
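Assuming that a pgz file is a gzip-compressed pickle, which the extension suggests but which should be checked against the notebook, writing and reading such a file could look like the sketch below. The dictionary keys and the sample values are hypothetical and only illustrate the x, y, z, r, g, b plus label idea.

```python
import gzip
import pickle
import tempfile
from pathlib import Path


def save_pgz(objects, path):
    """Pickle `objects` and write it gzip-compressed to `path`."""
    with gzip.open(path, "wb") as handle:
        pickle.dump(objects, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_pgz(path):
    """Load a gzip-compressed pickle. Only use this on files you trust."""
    with gzip.open(path, "rb") as handle:
        return pickle.load(handle)


# Hypothetical layout: one list of (x, y, z, r, g, b) tuples per object,
# with the class label stored at the same index.
objects = {
    "points": [
        [(0.10, 0.02, 0.05, 120, 118, 115), (0.11, 0.02, 0.05, 121, 119, 116)],
    ],
    "labels": ["AXIS"],
}

with tempfile.TemporaryDirectory() as tmp:
    pgz_path = Path(tmp) / "train.pgz"
    save_pgz(objects, pgz_path)
    restored = load_pgz(pgz_path)

print(restored["labels"])
```

Because unpickling can execute arbitrary code, such files should only be loaded when they come from a trusted source.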

2D dataset preprocessing

Create semantic labels using labelme.
Convert the semantic labels using labelme2voc.
If a KITTI dataset is required, convert the VOC dataset to KITTI using vod-converter.
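To show what the last conversion involves, the sketch below turns the bounding boxes of one VOC annotation into KITTI label lines. It is an illustration and not the vod-converter implementation; the function name and the embedded annotation are made up, and filling the truncation, occlusion, and 3D fields with zeros is an assumption for a 2D-only dataset.

```python
import xml.etree.ElementTree as ET

# Made-up VOC annotation with a single bounding box.
VOC_EXAMPLE = """<annotation>
  <filename>axis_0001.jpg</filename>
  <object>
    <name>AXIS</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>"""


def voc_to_kitti_lines(voc_xml):
    """Convert the objects of one VOC XML annotation into KITTI label lines."""
    root = ET.fromstring(voc_xml)
    lines = []
    for obj in root.iter("object"):
        # KITTI fields are space separated, so the class name must not contain spaces.
        name = obj.findtext("name").strip().replace(" ", "_")
        box = obj.find("bndbox")
        left, top, right, bottom = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # KITTI order: type, truncated, occluded, alpha, bbox (left top right bottom),
        # dimensions (3 values), location (3 values), rotation_y.
        fields = [name, "0.00", "0", "0.00"]
        fields += [f"{value:.2f}" for value in (left, top, right, bottom)]
        fields += ["0.00"] * 7  # placeholder 3D fields (assumption)
        lines.append(" ".join(fields))
    return lines


kitti_lines = voc_to_kitti_lines(VOC_EXAMPLE)
print(kitti_lines[0])
```

Each KITTI line has 15 fields; for 2D object detection only the class name and the four bounding box coordinates carry information.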