This is the code used for my bachelor's thesis: 'Traffic Scene Segmentation and Reconstruction with Deep Learning'. It uses a modified version of Yassouali's Pytorch Segmentation Framework, so make sure to check it out if you're interested.
It was used to reconstruct the first recording from the USyd Campus Dataset, using its lidar, video and odometry data. You can see the results in this video.
You can play around with this at home. We recommend having an Nvidia GPU for faster inference (this was done on a GTX 1060), and a high-end graphics card if you want to train your own segmentation model (not necessary, since we provide one).
- Follow the instructions at the dataset tools repository to get the lidar_camera_projection node working.
- Download Week 1 of the USyd Campus Dataset.
- Clone this repository and add the subfolder `tfg` to your catkin workspace.
- Download the model weights file from here and put it inside `tfg/src/pytorch_segmentation`.
- Run the lidar_camera_projection node from the dataset_metapackage.
- Run our point_cloud_listener node (`rosrun tfg point_cloud_listener.py`). A rough sketch of the kind of point accumulation this node performs is shown after this list.
- Optionally, you can preview the generated cloud with rviz by subscribing to the topic `final_cloud_publisher` and using the frame of reference `gmsl_centre_link`.
- Once you close (`ctrl+c`) the point_cloud_listener node, it will generate a file named `cloud_color.ply` with your combined point cloud.
- Use the provided script `cloud2mesh.py` to convert your point cloud into a polygonal mesh (see the meshing sketch below for the general idea). You will have to provide it with the input file name, and the output will be your input + '.obj'. We recommend not reconstructing sections that took more than 5 minutes to record, since this last step is extremely slow. If you want to reconstruct a larger portion, you can build it in small chunks and then combine them in Blender. The result will have to be rotated `-90º` on the X axis and will look something like this
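The accumulation itself is handled by the provided `point_cloud_listener.py`, but as a rough, hypothetical illustration of what the last steps do, the sketch below subscribes to a colored `PointCloud2` topic, buffers the points, and writes an ASCII `cloud_color.ply` on shutdown. The topic name `/projected_cloud` and the field layout are placeholders for illustration only; the real node does more work than this simple buffer.

```python
# Hypothetical sketch only: buffer colored points from a PointCloud2 topic
# and dump them to cloud_color.ply on ctrl+c. The topic name and field layout
# are assumptions; the real point_cloud_listener.py does more than this.
import struct

import rospy
from sensor_msgs.msg import PointCloud2
import sensor_msgs.point_cloud2 as pc2

points = []  # accumulated (x, y, z, r, g, b) tuples

def callback(msg):
    # 'rgb' arrives packed into a single float32; unpack it into three bytes.
    for x, y, z, rgb in pc2.read_points(msg, field_names=("x", "y", "z", "rgb"),
                                        skip_nans=True):
        packed = struct.unpack("I", struct.pack("f", rgb))[0]
        r, g, b = (packed >> 16) & 0xFF, (packed >> 8) & 0xFF, packed & 0xFF
        points.append((x, y, z, r, g, b))

def save_ply():
    # Write a minimal ASCII PLY header followed by one vertex per line.
    with open("cloud_color.ply", "w") as f:
        f.write("ply\nformat ascii 1.0\n")
        f.write("element vertex {}\n".format(len(points)))
        f.write("property float x\nproperty float y\nproperty float z\n")
        f.write("property uchar red\nproperty uchar green\nproperty uchar blue\n")
        f.write("end_header\n")
        for x, y, z, r, g, b in points:
            f.write("{} {} {} {} {} {}\n".format(x, y, z, r, g, b))

if __name__ == "__main__":
    rospy.init_node("point_cloud_listener_sketch")
    rospy.Subscriber("/projected_cloud", PointCloud2, callback)  # placeholder topic
    rospy.on_shutdown(save_ply)   # triggered when you ctrl+c the node
    rospy.spin()
```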
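For the meshing step, `cloud2mesh.py` is the script you should actually run; the sketch below only illustrates one common way of turning a point cloud into a mesh (Poisson surface reconstruction with Open3D). The library choice, the `depth` value and the file handling here are assumptions, not a description of the provided script.

```python
# Hedged illustration of point-cloud-to-mesh conversion using Open3D's Poisson
# surface reconstruction; not the provided cloud2mesh.py.
import sys

import open3d as o3d

input_path = sys.argv[1]            # e.g. cloud_color.ply
output_path = input_path + ".obj"   # mirrors the input + '.obj' convention

pcd = o3d.io.read_point_cloud(input_path)
pcd.estimate_normals()              # Poisson reconstruction needs per-point normals

# depth trades off detail against runtime; 9 is an arbitrary illustrative value
mesh, _densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=9)

o3d.io.write_triangle_mesh(output_path, mesh)
print("Wrote", output_path)
```

Running a script like this as `python mesh_sketch.py cloud_color.ply` would produce `cloud_color.ply.obj`, matching the input + '.obj' convention described above. After importing the .obj into Blender, the `-90º` rotation on the X axis can be applied from the object's rotation properties.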