This repository evaluates the 2DPASS algorithm, originally introduced in "2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds" by Yan et al., on additional datasets, namely the Waymo Open Dataset and Pandaset. The evaluation aims to provide further insight into the algorithm's strengths and limitations on datasets beyond SemanticKITTI and NuScenes.
Members: Yui Chang, Daniel Xu, Hoang Nguyen
Our project focuses on enhancing the understanding of semantic segmentation techniques for self-driving cars by evaluating 2DPASS across diverse datasets. Starting from a model trained on the SemanticKITTI dataset, we introduce the Waymo Open Dataset and Pandaset for further evaluation. Using the same testing methodology as the original 2DPASS implementation, we compare its performance across these datasets and identify limits to its generalizability and potential areas of improvement.
- Dataset Integration:
  - Converted Waymo and Pandaset into formats compatible with the 2DPASS algorithm.
  - Integrated the datasets into the training and evaluation pipeline.
- Pretrained Evaluation:
  - Evaluated the performance of pretrained 2DPASS weights on the new datasets.
- Model Retraining:
  - Trained 2DPASS from scratch on Waymo and Pandaset to benchmark performance differences.
- Findings:
  - Compared results with the original SemanticKITTI and NuScenes benchmarks, highlighting dataset-specific challenges and opportunities for refinement.
2DPASS leverages 2D priors from camera images to assist 3D LiDAR semantic segmentation. Key features include:
- Multi-Modal Knowledge Distillation: Integrates 2D and 3D features for improved semantic segmentation.
- MSFSKD: Multi-Scale Fusion-to-Single Knowledge Distillation, achieving state-of-the-art results on SemanticKITTI and NuScenes benchmarks.
- Point Cloud Input: Uses pure LiDAR point clouds at inference (camera images are needed only during training), enhancing robustness.
- Dataset Preparation:
  - Converted Waymo and Pandaset datasets into the required format for 2DPASS.
- Testing Pretrained Weights:
  - Evaluated pretrained weights on the new datasets.
- Training on New Datasets:
  - Trained a new 2DPASS model from scratch on Waymo and Pandaset.
- Performance Metrics:
  - Used mIoU and class accuracy to evaluate segmentation performance.
- 2D and 3D Feature Fusion:
  - Projects point clouds onto 2D image patches (point-to-pixel, or P2P, mapping).
  - Interpolates voxel features onto point clouds (point-to-voxel, or P2V, mapping).
- Training Pipeline:
  - LiDAR point clouds and cropped image patches generate multi-scale features.
  - These features are fused into a single semantic score using MSFSKD.
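The P2P mapping listed above can be illustrated with a standard pinhole projection. The sketch below is a generic NumPy version and not the 2DPASS implementation: the function name `project_points_to_pixels`, its argument names, and the assumption of an undistorted pinhole camera with a 4x4 LiDAR-to-camera extrinsic are ours.

```python
import numpy as np


def project_points_to_pixels(points_lidar, T_cam_from_lidar, K, image_hw):
    """Map LiDAR points to pixel indices of one camera (illustrative sketch).

    points_lidar:     (N, 3) xyz coordinates in the LiDAR frame
    T_cam_from_lidar: (4, 4) homogeneous transform, LiDAR frame -> camera frame
    K:                (3, 3) pinhole intrinsic matrix (no lens distortion assumed)
    image_hw:         (height, width) of the image in pixels

    Returns (pixels, mask): pixels is (M, 2) integer (row, col) for the points
    that land inside the image; mask is an (N,) boolean array selecting them.
    """
    height, width = image_hw
    n = points_lidar.shape[0]

    # Homogeneous coordinates, then move the points into the camera frame.
    homo = np.hstack([points_lidar, np.ones((n, 1))])
    cam = (homo @ T_cam_from_lidar.T)[:, :3]

    # Apply the intrinsics; the third component is the depth along the optical axis.
    uvw = cam @ K.T
    depth = uvw[:, 2]
    in_front = depth > 1e-6

    # Avoid dividing by zero or negative depth for points behind the camera.
    safe_depth = np.where(in_front, depth, 1.0)
    col = np.floor(uvw[:, 0] / safe_depth).astype(np.int64)
    row = np.floor(uvw[:, 1] / safe_depth).astype(np.int64)

    inside = in_front & (col >= 0) & (col < width) & (row >= 0) & (row < height)
    pixels = np.stack([row[inside], col[inside]], axis=1)
    return pixels, inside
```

Points behind the camera or outside the image are masked out, so image features are typically available for only part of each scan. Any error in the extrinsic matrix shifts every projected pixel, which is why camera-LiDAR calibration quality matters for this step.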
Metric | Pretrained Weights on SemanticKITTI |
---|---|
mIoU | 72.9% |
Accuracy | 90.1% |
Road Detection | 89.7% |
Car Detection | 97.0% |
Metric | Pretrained Weights (evaluated on Waymo) | Trained on Waymo |
---|---|---|
mIoU | 8.22% | 34.9% |
Accuracy | 21.2% | 40.4% |
Road Detection | 4.51% | 0.00% |
Car Detection | 35.28% | 75.38% |
Metric | Pretrained Weights (evaluated on Pandaset) | Trained on Pandaset |
---|---|---|
mIoU | 1.3% | 57.3% |
Accuracy | 7.0% | N/A |
Road Detection | 14.4% | 79.9% |
Car Detection | 7.6% | 80.2% |
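For readers who want to reproduce figures of the kind reported in the tables above, the following is a minimal sketch of how mIoU and accuracy are commonly computed from a confusion matrix. It is not the evaluation code of this repository. The function name `segmentation_metrics` is hypothetical, and two choices are assumptions: label `0` is treated as "ignore" (as is conventional for SemanticKITTI-style training labels), and accuracy is taken as overall point accuracy, which may differ from the accuracy definition used for the tables.

```python
import numpy as np


def segmentation_metrics(pred, target, num_classes, ignore_label=0):
    """Return (per-class IoU, mIoU, overall accuracy) for integer label arrays.

    Assumptions (not taken from the repository): points whose ground-truth
    label equals `ignore_label` are dropped, and that class is excluded
    from the mean.
    """
    pred = np.asarray(pred).astype(np.int64)
    target = np.asarray(target).astype(np.int64)

    keep = target != ignore_label
    pred, target = pred[keep], target[keep]

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    tp = np.diag(conf).astype(np.float64)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp  # TP + FP + FN

    # Classes that never occur (and the ignored class) get NaN instead of 0.
    valid = union > 0
    valid[ignore_label] = False
    iou = np.full(num_classes, np.nan)
    iou[valid] = tp[valid] / union[valid]

    miou = float(np.nanmean(iou))
    accuracy = float(tp.sum() / conf.sum())
    return iou, miou, accuracy
```

Because mIoU averages over classes rather than points, a single class with an IoU of 0 lowers the mean noticeably even when overall accuracy is moderate.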
- 2DPASS performs well on the SemanticKITTI and NuScenes benchmarks, especially for common classes such as road, car, and vegetation.
- Performance on Waymo and Pandaset was hindered by:
  - Differences in LiDAR density and camera resolution.
  - Misaligned camera-LiDAR setups in Pandaset.
  - Limited class frequency in training datasets.
- Weights pretrained on SemanticKITTI generalized poorly to Waymo and Pandaset (8.22% and 1.3% mIoU, respectively), suggesting the need for dataset-specific fine-tuning.
- Challenges:
  - Limited transferability across datasets due to inherent differences in sensor setups and data collection conditions.
  - Suboptimal performance on Pandaset caused by differences in color gamut and misaligned camera-LiDAR calibration.
- Proposed Improvements:
  - Data Augmentation: Introduce color augmentation techniques for better generalization.
  - Model Adaptation: Leverage techniques like LoRA to fine-tune pretrained weights on new datasets.
  - Backbone Upgrade: Replace outdated Caffe backbones with modern architectures.
  - Pose Alignment: Improve camera-LiDAR pose calibration for Pandaset.
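As an illustration of the color augmentation proposed above, here is a minimal NumPy sketch of random brightness, contrast, and saturation jitter. It is a generic technique and has not been evaluated in this project. The function name `color_jitter`, the default strengths, and the assumption that images are float RGB arrays in `[0, 1]` are ours.

```python
import numpy as np

# Standard luma weights used to convert RGB to grayscale.
_GRAY_WEIGHTS = np.array([0.299, 0.587, 0.114])


def color_jitter(image, rng, brightness=0.2, contrast=0.2, saturation=0.2):
    """Randomly perturb the colors of an (H, W, 3) float RGB image in [0, 1].

    `rng` is a numpy.random.Generator; each strength s draws a factor
    uniformly from [1 - s, 1 + s]. A strength of 0 disables that step.
    """
    out = np.asarray(image, dtype=np.float64)

    # Brightness: scale all channels by the same factor.
    out = out * rng.uniform(1.0 - brightness, 1.0 + brightness)

    # Contrast: stretch or compress values around the mean intensity.
    mean = out.mean()
    out = (out - mean) * rng.uniform(1.0 - contrast, 1.0 + contrast) + mean

    # Saturation: move each pixel toward or away from its grayscale value.
    gray = (out @ _GRAY_WEIGHTS)[..., None]
    out = (out - gray) * rng.uniform(1.0 - saturation, 1.0 + saturation) + gray

    return np.clip(out, 0.0, 1.0)
```

A possible usage is `color_jitter(img, np.random.default_rng(0))` applied to each camera image during training only, so that the 2D branch sees a wider range of color statistics than a single dataset provides.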
You can download the pretrained weights for evaluation using the links below:
The Pandaset conversion code is based on the repository SiMoM0/Pandaset2Kitti and has been modified to suit the needs of the 2DPASS evaluation pipeline. The modified conversion script is added to this repository as `/pandaset/convert.py`.
The Waymo conversion code is located in the repository IrohXu/waymo_to_semanticKITTI. Follow the instructions in that repository to transform the .tfrecord files to the SemanticKITTI format. To remap the Waymo labels to SemanticKITTI's, use `/waymo/labelfix.py`.
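To show what a label remapping step of this kind involves, the sketch below applies a lookup table to SemanticKITTI-style labels. It is not the contents of `labelfix.py`. The ID pairs in `EXAMPLE_MAP` are illustrative only and must be checked against both datasets' label definitions, and the function names are hypothetical. The one fixed fact used here is the SemanticKITTI `.label` layout: one `uint32` per point, with the semantic class in the lower 16 bits and the instance ID in the upper 16 bits. That the converted Waymo labels follow this layout is an assumption.

```python
import numpy as np

# ILLUSTRATIVE source-ID -> target-ID pairs, not copied from labelfix.py.
EXAMPLE_MAP = {0: 0, 1: 10, 2: 18, 18: 40}


def build_lut(mapping, default=0):
    """Build a lookup table covering every possible 16-bit semantic ID.

    IDs that are absent from `mapping` fall back to `default`.
    """
    lut = np.full(1 << 16, default, dtype=np.uint32)
    for src, dst in mapping.items():
        lut[src] = dst
    return lut


def remap_labels(raw, lut):
    """Remap the semantic part of packed labels and keep the instance part."""
    raw = np.asarray(raw, dtype=np.uint32)
    semantic = raw & np.uint32(0xFFFF)   # lower 16 bits
    instance = raw >> np.uint32(16)      # upper 16 bits
    return (instance << np.uint32(16)) | lut[semantic]
```

A lookup table keeps the remap to a single vectorized indexing operation per scan, and unknown source IDs fall back to the default class instead of raising an error.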
- Convert the PandaSet or Waymo data and put the converted data into the `dataset` folder.
- If using Pandaset, place the `pandaset.yaml` configuration file and label map file into their corresponding folders (similar to the SemanticKITTI structure). Ensure the contents and file names are correctly formatted.
- Put the pretrained checkpoint into the `logs` folder. Pretrained logs should be stored in `logs/SemanticKITTI`.
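Before launching an evaluation, it can help to confirm that the folders named in the steps above exist. The following is a small sketch under the assumption that `dataset` and `logs/SemanticKITTI` sit directly under the repository root. The helper name `missing_dirs` is hypothetical, and the location of `pandaset.yaml` is deliberately not checked because the steps above do not pin it down.

```python
from pathlib import Path

# Folder names taken from the setup steps; their position relative to the
# repository root is an assumption.
REQUIRED_DIRS = ("dataset", "logs/SemanticKITTI")


def missing_dirs(repo_root, required=REQUIRED_DIRS):
    """Return the required directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in required if not (root / rel).is_dir()]


if __name__ == "__main__":
    for rel in missing_dirs("."):
        print(f"missing directory: {rel}")
```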
Conversion is done using the `convert.py` script:
python convert.py <path_to_pandaset> <path_to_output>
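After conversion, a quick sanity check is to load one scan and confirm that the point and label counts agree. The sketch below assumes the output follows the standard SemanticKITTI layout (a `.bin` file of `float32` values in groups of x, y, z, remission, and a `.label` file with one `uint32` per point). Whether `convert.py` writes exactly this layout should be verified, and the function name `load_scan` and any file paths passed to it are hypothetical.

```python
import numpy as np


def load_scan(bin_path, label_path):
    """Load one SemanticKITTI-style scan and its labels.

    Returns (xyz, remission, semantic, instance). Raises ValueError when
    the number of labels does not match the number of points.
    """
    points = np.fromfile(bin_path, dtype=np.float32).reshape(-1, 4)
    raw = np.fromfile(label_path, dtype=np.uint32)
    if raw.shape[0] != points.shape[0]:
        raise ValueError(
            f"{points.shape[0]} points but {raw.shape[0]} labels"
        )
    semantic = raw & np.uint32(0xFFFF)  # lower 16 bits: class ID
    instance = raw >> np.uint32(16)     # upper 16 bits: instance ID
    return points[:, :3], points[:, 3], semantic, instance
```

Printing `np.unique(semantic)` for a few converted scans is a simple way to spot class IDs that are missing from the label map before training starts.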
Code is forked from yanx27/2DPASS
Pandaset conversion code is modified from SiMoM0/Pandaset2Kitti
PandaSet: https://pandaset.org/