# ROS wrapper for Singleshotpose (Yolo6D) on custom dataset
- tested on Ubuntu 18.04, ROS Melodic, RTX 2080 Ti, CUDA 10.1, Python 3.7, PyTorch 1.4.1
- clone into your catkin workspace: `git clone https://github.com/avasalya/Yolo6D_ROS.git`
- refer to `environment.yml` for other anaconda packages
- https://github.com/microsoft/singleshotpose (original)
- https://github.com/avasalya/singleshot6Dpose (modified for onigiri usage)
- create the conda environment: `conda env create -f environment.yml`
- download trained weights: https://drive.google.com/drive/folders/19Cc6pna7r8qb4ebS_C8b7RwF3QoFsoeZ?usp=sharing
- weights v1.x and v2.x were trained on a synthetic dataset.
- v3.x onwards were trained on a real dataset.
- only weights v7.x and v8.x were trained on the tx-provided dataset.
- change values in `txonigiri/txonigiri.data`
- flag `use_dd` (default: False)
- flag `use_nms` (default: True)
- flag `use_pixelD` (default: True)
- flag `use_pnpG` (default: True)
- tune `conf` threshold as needed
- tune `dd` threshold as needed
- tune `nms` threshold as needed
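As a rough illustration of what the `conf` and `nms` thresholds control, here is a minimal sketch of confidence filtering followed by greedy non-maximum suppression. This is not the wrapper's actual code; the function names and box format are hypothetical:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def filter_detections(dets, conf_thresh=0.5, nms_thresh=0.4):
    """Drop detections below conf_thresh, then keep only boxes that do not
    overlap an already-kept, higher-confidence box by more than nms_thresh.
    dets: list of (box, confidence) with box = (x1, y1, x2, y2)."""
    dets = [d for d in dets if d[1] >= conf_thresh]
    dets.sort(key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in dets:
        if all(iou(box, k[0]) < nms_thresh for k in kept):
            kept.append((box, score))
    return kept
```

Raising `conf` discards weak detections earlier; lowering `nms` suppresses overlapping boxes more aggressively.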
- launch the camera: `roslaunch realsense2_camera rs_rgbd.launch align_depth:=true`
- it publishes the estimated pose as `geometry_msgs/PoseArray`
- `roslaunch yolo6d_ros yolo6d.launch`
- also possible via `roslaunch yolo6d_ros rviz.launch`
- `rosrun yolo6d_ros yolo6d_ros.py`
- or simply `python3 scripts/yolo6d_ros.py`
- refer to this repository to train on your own custom dataset: https://github.com/avasalya/singleshot6Dpose
Label files consist of 21 ground-truth values. We predict 9 points corresponding to the centroid and corners of the 3D object model, plus the class in each cell, which makes 9x2+1 = 19 numbers. In multi-object training, we assign whichever anchor box has the most similar size to the current object as the one responsible for predicting that object's 2D coordinates. To encode the size of the objects, there are 2 additional numbers for the range in the x and y dimensions. Therefore, we have 9x2+1+2 = 21 numbers.
Respectively, 21 numbers correspond to the following: 1st number: class label, 2nd number: x0 (x-coordinate of the centroid), 3rd number: y0 (y-coordinate of the centroid), 4th number: x1 (x-coordinate of the first corner), 5th number: y1 (y-coordinate of the first corner), ..., 18th number: x8 (x-coordinate of the eighth corner), 19th number: y8 (y-coordinate of the eighth corner), 20th number: x range, 21st number: y range.
The coordinates are normalized by the image width and height: x / image_width and y / image_height. This keeps the output ranges similar for the coordinate-regression and object-classification tasks.
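The label format above can be sketched as a small parser that reads one line back into pixel coordinates. This assumes space-separated values as in the singleshotpose label files; `parse_label_line` is a hypothetical helper, not part of either repository:

```python
def parse_label_line(line, image_width, image_height):
    """Parse one 21-value label line: class label, then 9 normalized (x, y)
    points (centroid + 8 corners of the 3D bounding box), then the x and y
    range. Returns the class and the values scaled back to pixels."""
    values = [float(v) for v in line.split()]
    assert len(values) == 21, "expected 21 values per label line"
    cls = int(values[0])
    points = [(values[1 + 2 * i] * image_width,
               values[2 + 2 * i] * image_height)
              for i in range(9)]  # points[0] is the centroid
    x_range = values[19] * image_width
    y_range = values[20] * image_height
    return cls, points, (x_range, y_range)
```

For a 640x480 image, a line starting with `0 0.5 0.5 ...` would place the centroid of a class-0 object at pixel (320, 240).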
- use this repository (develop branch) to create your own dataset for Yolo6D: https://github.com/avasalya/RapidPoseLabels/tree/develop (instructions are provided)
- I have made changes to the original repository to meet the requirements for producing a dataset for Yolo6D.
- please read `dataset.sh` for further instructions on how to create your own dataset
- once you run `dataset.sh`, it will generate the `out_cur_date` folder with several contents; to train Yolo6D you only need to copy the `rgb`, `mask`, and `label` folders and the `train.txt`, `test.txt`, and `yourObject.ply` files.
- please refer to the original work for further support: https://github.com/rohanpsingh/RapidPoseLabels (PS: my forked branch is not updated).
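The copy step described above could be scripted roughly as follows. This is a sketch only: `collect_for_yolo6d` is a hypothetical helper, and the folder and file names simply follow the description above, so adjust them to what `dataset.sh` actually produces on your machine:

```python
import shutil
from pathlib import Path

def collect_for_yolo6d(out_dir, dest_dir, ply_name="yourObject.ply"):
    """Copy only the pieces Yolo6D training needs out of the dataset.sh
    output folder: rgb/, mask/, label/ plus train.txt, test.txt and the .ply."""
    out_dir, dest_dir = Path(out_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    for folder in ("rgb", "mask", "label"):
        # copytree creates dest_dir/<folder>; it must not already exist
        shutil.copytree(out_dir / folder, dest_dir / folder)
    for fname in ("train.txt", "test.txt", ply_name):
        shutil.copy2(out_dir / fname, dest_dir / fname)
```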