The PyTorch Implementation of the paper: RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving (ECCV 2020)
- Realtime 3D object detection based on a monocular RGB image
- Support distributed data parallel training
- Tensorboard
- Implement the Keypoint FPN in the model
- Implement part 3.2 (3D Bounding Box Estimation) and revise formula (7)
- Revise the loss for depth estimation (formula (3)): the normalized depth may be < 0, in which case the log operator cannot be applied
- Release pre-trained models
pip install -U -r requirements.txt
Download the 3D KITTI detection dataset from here.
The downloaded data includes:
- Training labels of object data set (5 MB)
- Camera calibration matrices of object data set (16 MB)
- Left color images of object data set (12 GB)
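Each file in the training labels holds one object per line in the standard KITTI object format (15 whitespace-separated fields). The following is a minimal sketch of a parser for that format; the names `KittiObject`, `parse_label_line`, and `SAMPLE_LINE` are hypothetical and are not taken from this repository, and the sample values are only illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class KittiObject:
    """One row of a KITTI object label file (label_2/*.txt)."""
    cls_name: str            # e.g. 'Car', 'Pedestrian', 'Cyclist', 'DontCare'
    truncated: float         # 0 (fully visible) .. 1 (fully truncated)
    occluded: int            # 0, 1, 2, or 3 (unknown)
    alpha: float             # observation angle in [-pi, pi]
    bbox: List[float]        # 2D box in pixels: left, top, right, bottom
    dimensions: List[float]  # 3D size in metres: height, width, length
    location: List[float]    # x, y, z in the camera frame, in metres
    rotation_y: float        # yaw around the camera Y axis in [-pi, pi]


def parse_label_line(line: str) -> KittiObject:
    """Parse a single label line; extra trailing fields (e.g. a score) are ignored."""
    fields = line.split()
    if len(fields) < 15:
        raise ValueError(f"expected at least 15 fields, got {len(fields)}")
    values = [float(v) for v in fields[1:15]]
    return KittiObject(
        cls_name=fields[0],
        truncated=values[0],
        occluded=int(values[1]),
        alpha=values[2],
        bbox=values[3:7],
        dimensions=values[7:10],
        location=values[10:13],
        rotation_y=values[13],
    )


# Illustrative label line (hypothetical values, KITTI field order)
SAMPLE_LINE = "Car 0.00 0 -1.58 587.01 173.33 614.12 200.12 1.65 1.67 3.64 -0.65 1.71 46.70 -1.59"
```

Calling `parse_label_line(SAMPLE_LINE)` returns a `KittiObject` whose `dimensions` are ordered height, width, length, which is easy to get wrong when building 3D boxes.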
Please make sure that the source code and dataset directories are structured as shown below.
The model takes only RGB images as input and outputs the main center heatmap, the vertexes heatmap, and the vertexes coordinates, which serve as the base module for estimating the 3D bounding box.
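To illustrate how a center heatmap can be turned into object candidates, here is a minimal sketch of CenterNet-style peak extraction: a cell is kept if it exceeds a score threshold and is the maximum of its 3x3 neighbourhood. This is plain Python on nested lists for clarity, not the decoding code of this repository; the function name `find_peaks` and the default `threshold` and `top_k` values are assumptions.

```python
from typing import List, Tuple

Heatmap = List[List[float]]


def find_peaks(heatmap: Heatmap, threshold: float = 0.3,
               top_k: int = 50) -> List[Tuple[float, int, int]]:
    """Return up to top_k (score, row, col) local maxima, highest score first."""
    height, width = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(height):
        for c in range(width):
            score = heatmap[r][c]
            if score < threshold:
                continue
            # 3x3 neighbourhood (including the cell itself), clipped at the borders
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, height))
                for cc in range(max(c - 1, 0), min(c + 2, width))
            ]
            # Keep the cell only if nothing around it scores higher
            if score >= max(neighbours):
                peaks.append((score, r, c))
    peaks.sort(reverse=True)
    return peaks[:top_k]
```

In a tensor-based pipeline the same test is usually done with a 3x3 max pooling over the heatmap followed by a top-k selection.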
cd src/data_process
- To visualize camera images (with 3D boxes), let's execute:
python kitti_dataset.py
Then press n to see the next sample, or press Esc to quit.
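Drawing a 3D box on a camera image amounts to computing the eight box corners in the camera frame and projecting them with the 3x4 camera projection matrix from the calibration file. The sketch below follows the usual KITTI conventions (box origin at the centre of the bottom face, Y axis pointing down, rotation around Y). The function names and the corner ordering are assumptions and may differ from what `kitti_dataset.py` does.

```python
import math
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def box3d_corners(dimensions: Sequence[float], location: Sequence[float],
                  rotation_y: float) -> List[Point3D]:
    """Return the 8 corners of a KITTI 3D box in camera coordinates."""
    h, w, l = dimensions  # KITTI order: height, width, length
    x, y, z = location    # centre of the bottom face
    # Corner offsets in the object frame (4 bottom corners, then 4 top corners)
    xs = [l / 2, l / 2, -l / 2, -l / 2, l / 2, l / 2, -l / 2, -l / 2]
    ys = [0.0, 0.0, 0.0, 0.0, -h, -h, -h, -h]
    zs = [w / 2, -w / 2, -w / 2, w / 2, w / 2, -w / 2, -w / 2, w / 2]
    cos_r, sin_r = math.cos(rotation_y), math.sin(rotation_y)
    corners = []
    for cx, cy, cz in zip(xs, ys, zs):
        # Rotate around the Y axis, then translate to the box location
        corners.append((cos_r * cx + sin_r * cz + x,
                        cy + y,
                        -sin_r * cx + cos_r * cz + z))
    return corners


def project_to_image(point: Point3D,
                     P: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Project a camera-frame point to pixel coordinates with a 3x4 matrix P."""
    x, y, z = point
    u, v, depth = (row[0] * x + row[1] * y + row[2] * z + row[3] for row in P)
    return u / depth, v / depth
```

Projecting each corner with `project_to_image` and connecting the resulting pixels gives the wireframe box. Points behind the camera (non-positive depth) need to be filtered out before drawing.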
Download the trained model from here, then put it in ${ROOT}/checkpoints/ and execute:
python test.py --gpu_idx 0 --arch resnet_18 --pretrained_path ../checkpoints/rtm3d_resnet_18.pth
python evaluate.py --gpu_idx 0 --arch resnet_18 --pretrained_path <PATH>
python train.py --gpu_idx 0 --batch_size <N> --num_workers <N>...
We should always use the nccl backend for multi-processing distributed training since it currently provides the best distributed training performance.
- Single machine (node), multiple GPUs
python train.py --dist-url 'tcp://127.0.0.1:29500' --dist-backend 'nccl' --multiprocessing-distributed --world-size 1 --rank 0
- Two machines (two nodes), multiple GPUs
First machine
python train.py --dist-url 'tcp://IP_OF_NODE1:FREEPORT' --dist-backend 'nccl' --multiprocessing-distributed --world-size 2 --rank 0
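Second machine (use the same --dist-url as the first machine, since every node connects to the address of the rank 0 node)
python train.py --dist-url 'tcp://IP_OF_NODE1:FREEPORT' --dist-backend 'nccl' --multiprocessing-distributed --world-size 2 --rank 1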
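In the commands above, --world-size and --rank are given per node. A common convention, used in PyTorch's official ImageNet example, is for the script to expand these into one process per GPU. The sketch below shows that arithmetic; whether train.py follows exactly this convention is an assumption, and the function name `global_ranks` is hypothetical.

```python
from typing import List, Tuple


def global_ranks(node_rank: int, num_nodes: int,
                 gpus_per_node: int) -> Tuple[int, List[int]]:
    """Return (total process count, global ranks of the processes on this node)."""
    # One process per GPU, so the effective world size is nodes * GPUs per node
    world_size = num_nodes * gpus_per_node
    # Processes on a node get consecutive ranks, offset by the node index
    ranks = [node_rank * gpus_per_node + gpu for gpu in range(gpus_per_node)]
    return world_size, ranks
```

Under this convention, with two nodes of four GPUs each, the first node (--rank 0) hosts global ranks 0 to 3 and the second node (--rank 1) hosts ranks 4 to 7.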
To reproduce the results, you can run the bash shell script
./train.sh
- To track the training progress, go to the logs/ folder and run:
cd logs/<saved_fn>/tensorboard/
tensorboard --logdir=./
- Then go to http://localhost:6006/
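When several runs have been saved, it can help to list which run folders actually contain TensorBoard logs before picking one. A minimal sketch, assuming the logs/<saved_fn>/tensorboard/ layout described above; the helper name `list_tensorboard_dirs` is hypothetical.

```python
from pathlib import Path
from typing import List


def list_tensorboard_dirs(logs_root: str) -> List[Path]:
    """Return every <logs_root>/<run>/tensorboard directory, sorted by path."""
    root = Path(logs_root)
    return sorted(p for p in root.glob("*/tensorboard") if p.is_dir())
```

Any returned path can be passed to `tensorboard --logdir=<path>`.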
If you think this work is useful, please give me a star!
If you find any errors or have any suggestions, please contact me (Email: nguyenmaudung93.kstn@gmail.com).
Thank you!
@inproceedings{RTM3D,
author = {Peixuan Li and Huaici Zhao and Pengfei Liu and Feidao Cao},
title = {RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving},
booktitle = {European Conference on Computer Vision (ECCV)},
year = {2020}
}
@misc{RTM3D-PyTorch,
author = {Nguyen Mau Dung},
title = {{RTM3D-PyTorch: PyTorch Implementation of the RTM3D paper}},
howpublished = {\url{https://github.com/maudzung/RTM3D-PyTorch}},
year = {2020}
}
[1] CenterNet: Objects as Points paper, PyTorch Implementation
${ROOT}
├── checkpoints/
│   └── rtm3d_resnet_18.pth
├── dataset/
│   └── kitti/
│       ├── ImageSets/
│       │   ├── test.txt
│       │   ├── train.txt
│       │   └── val.txt
│       ├── training/
│       │   ├── image_2/
│       │   ├── calib/
│       │   └── label_2/
│       ├── testing/
│       │   ├── image_2/
│       │   └── calib/
│       └── classes_names.txt
├── src/
│   ├── config/
│   │   ├── train_config.py
│   │   └── kitti_config.py
│   ├── data_process/
│   │   ├── kitti_dataloader.py
│   │   ├── kitti_dataset.py
│   │   ├── kitti_data_utils.py
│   │   └── transformation.py
│   ├── models/
│   │   ├── resnet.py
│   │   └── model_utils.py
│   ├── utils/
│   │   ├── evaluation_utils.py
│   │   ├── logger.py
│   │   ├── misc.py
│   │   ├── torch_utils.py
│   │   └── train_utils.py
│   ├── evaluate.py
│   ├── test.py
│   ├── train.py
│   └── train.sh
├── README.md
└── requirements.txt
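The directory tree above can be checked before training with a short script. This is a minimal sketch using only the standard library; the list `EXPECTED_DIRS` is transcribed from the tree, and the function name `missing_dirs` is hypothetical.

```python
from pathlib import Path
from typing import List

# Directories taken from the tree above, relative to ${ROOT}
EXPECTED_DIRS = [
    "checkpoints",
    "dataset/kitti/ImageSets",
    "dataset/kitti/training/image_2",
    "dataset/kitti/training/calib",
    "dataset/kitti/training/label_2",
    "dataset/kitti/testing/image_2",
    "dataset/kitti/testing/calib",
    "src",
]


def missing_dirs(root: str) -> List[str]:
    """Return the expected directories that do not exist under root."""
    base = Path(root)
    return [d for d in EXPECTED_DIRS if not (base / d).is_dir()]
```

An empty return value means every expected directory is present; otherwise the list names what still has to be created or downloaded.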