Dataset
Dataset collection
3D dataset collection
Objects are placed on a rotating table so that the camera can capture them from different angles. Alternatively, place the object on a normal table and change its orientation manually.
Note
This only works with a single object.
Setup:
Using external camera
Launch the camera
Go to RealSense2 camera for more information about the camera.
Apply a static transform from camera_frame to base_link as explained in RealSense2 camera.
Make sure the point cloud of the plane is parallel to the ground in rviz; transform or rotate it if necessary.
Note
The passthrough filter will not work if the plane is not parallel to the ground.
Launch multimodal object recognition
roslaunch mir_object_recognition multimodal_object_recognition.launch debug_mode:=true
Note
Dataset collection requires debug_mode to be enabled. You can also point to a specific logdir to save the data, e.g. logdir:=/home/robocup/cloud_dataset.
Start collecting the dataset
rostopic pub /mir_perception/multimodal_object_recognition/event_in std_msgs/String e_start
Using robot arm camera
Bring up the robot
Start multimodal_object_recognition and continue with the next steps as described previously.
Note
The segmented point clouds are saved in the logdir.
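To illustrate why the plane has to be parallel to the ground for the passthrough filter: a filter of this kind keeps only the points whose coordinate along one axis lies within fixed limits. The following is a minimal pure-Python sketch of that idea, not the filter used by mir_object_recognition. The function names, the z limits, and the 20 degree tilt are hypothetical values chosen for illustration.

```python
import math

def passthrough(points, axis=2, lower=0.0, upper=0.1):
    """Keep points whose coordinate along `axis` lies in [lower, upper]."""
    return [p for p in points if lower <= p[axis] <= upper]

def rotate_about_x(points, angle_rad):
    """Rotate points about the x axis (simulates a tilted camera frame)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(x, c * y - s * z, s * y + c * z) for x, y, z in points]

# A flat table surface at z = 0.05 m, sampled along y from 0 m to 1 m.
table = [(0.0, y / 10.0, 0.05) for y in range(11)]

# Aligned frame: every table point falls inside the z limits.
aligned_kept = passthrough(table, axis=2, lower=0.0, upper=0.1)

# Tilted frame: z now grows with y, so most of the table is cut off.
tilted = rotate_about_x(table, math.radians(20))
tilted_kept = passthrough(tilted, axis=2, lower=0.0, upper=0.1)

print(len(aligned_kept), len(tilted_kept))
```

With the aligned frame all sampled points survive the filter, while in the tilted frame only the points closest to the rotation axis remain within the limits.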
2D dataset collection
Images can be collected using the robot camera or an external camera. They can also be collected using the easy augment tool, which uses an Intel RealSense D435 camera to capture images and automatically annotate them for 2D object detection.
Dataset preprocessing
Before training the model, the data should be preprocessed. This includes, but is not limited to, removing bad data, normalization, and converting the data to the required format, such as h5 for point clouds and VOC or KITTI for images.
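As a rough sketch of two of these steps for point clouds, the example below drops points with non-finite coordinates and then centers the cloud and scales it to fit inside the unit sphere. This particular normalization is an assumption, since the text does not specify which one is used, and the function names are hypothetical.

```python
import math

def drop_invalid(points):
    """Remove points that contain NaN or infinite coordinates."""
    return [p for p in points if all(math.isfinite(v) for v in p)]

def normalize_to_unit_sphere(points):
    """Center xyz points on their centroid and scale so the farthest point has norm 1."""
    n = len(points)
    if n == 0:
        return []
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    centered = [(x - cx, y - cy, z - cz) for x, y, z in points]
    max_norm = max(math.sqrt(x * x + y * y + z * z) for x, y, z in centered)
    if max_norm == 0.0:
        return centered  # all points coincide; nothing to scale
    return [(x / max_norm, y / max_norm, z / max_norm) for x, y, z in centered]

raw = [(1.0, 2.0, 3.0), (3.0, 2.0, 3.0), (float("nan"), 0.0, 0.0), (2.0, 2.0, 5.0)]
clean = drop_invalid(raw)
normalized = normalize_to_unit_sphere(clean)
```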
3D dataset preprocessing
An example of the data directory structure:
b-it-bots_atwork_dataset
├── train
│   ├── AXIS
│   │   ├── axis_0001.pcd
│   │   ├── ...
│   ├── ...
├── test
│   ├── AXIS
│   │   ├── axis_0001.pcd
│   │   ├── ...
│   ├── ...
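A directory laid out like this can be indexed by using the class folder name as the label. The sketch below is one possible way to do that with the standard library; `index_dataset` is a hypothetical helper and is not taken from the preprocessing notebook.

```python
from pathlib import Path

def index_dataset(root):
    """Return {split: [(pcd_path, label), ...]} for a root laid out as split/CLASS/*.pcd."""
    index = {}
    for split_dir in sorted(Path(root).iterdir()):
        if not split_dir.is_dir():
            continue
        samples = []
        for class_dir in sorted(split_dir.iterdir()):
            if not class_dir.is_dir():
                continue
            # The folder name (e.g. AXIS) serves as the class label.
            for pcd_path in sorted(class_dir.glob("*.pcd")):
                samples.append((pcd_path, class_dir.name))
        index[split_dir.name] = samples
    return index
```

Calling `index_dataset("b-it-bots_atwork_dataset")` would then return one list of `(path, label)` pairs for `train` and one for `test`.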
The dataset preprocessing can be found in this notebook.
It will generate pgz files, each containing a dictionary of objects consisting of x, y, z, r, g, b values and a label.
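Assuming that pgz here means a gzip-compressed pickle file (an assumption, since the notebook itself is not reproduced here), such a file could be written and read back roughly as follows. The helper names and the exact dictionary layout are hypothetical.

```python
import gzip
import pickle

def save_pgz(obj, path):
    """Write a Python object to a gzip-compressed pickle file."""
    with gzip.open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)

def load_pgz(path):
    """Read a Python object back from a gzip-compressed pickle file.

    Only load files from a trusted source: unpickling can execute arbitrary code.
    """
    with gzip.open(path, "rb") as f:
        return pickle.load(f)

# Hypothetical layout: one entry per object with per-point columns and a label.
sample = {
    "x": [0.10, 0.12],
    "y": [0.00, 0.01],
    "z": [0.05, 0.05],
    "r": [200, 198],
    "g": [30, 32],
    "b": [30, 29],
    "label": "AXIS",
}
```

A round trip such as `save_pgz(sample, "sample.pgz")` followed by `load_pgz("sample.pgz")` should return an equal dictionary.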
2D dataset preprocessing
Create semantic labels using labelme.
Convert the semantic labels using labelme2voc.
If a KITTI dataset is required, convert the VOC dataset to KITTI using vod-converter.
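To give an idea of what the VOC to KITTI conversion involves for bounding boxes, the sketch below turns the objects of a Pascal VOC annotation into KITTI label lines. It is not how vod-converter is implemented. The function name is hypothetical, and filling the truncation, occlusion, and 3D fields with zeros is an assumption suited to 2D-only data.

```python
import xml.etree.ElementTree as ET

def voc_to_kitti_lines(voc_xml):
    """Convert the <object> entries of a Pascal VOC annotation into KITTI label lines."""
    root = ET.fromstring(voc_xml)
    lines = []
    for obj in root.iter("object"):
        # KITTI labels are space separated, so the class name must not contain spaces.
        name = obj.findtext("name").strip().replace(" ", "_")
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # KITTI fields: type, truncated, occluded, alpha, bbox (left top right bottom),
        # dimensions (h w l), location (x y z), rotation_y. Unused fields are zeroed.
        fields = [
            name, "0.00", "0", "0.00",
            f"{xmin:.2f}", f"{ymin:.2f}", f"{xmax:.2f}", f"{ymax:.2f}",
            "0.00", "0.00", "0.00", "0.00", "0.00", "0.00", "0.00",
        ]
        lines.append(" ".join(fields))
    return lines

example = """<annotation>
  <object>
    <name>AXIS</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>"""

kitti_lines = voc_to_kitti_lines(example)
```

Each resulting line has 15 space-separated fields, with the class name first and the 2D box in fields 5 to 8.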