Training RetinaNet for Traffic Sign Detection, based on the Fizyr implementation of RetinaNet in Keras
Traffic sign detection plays a significant role in autonomous driving. This project explores the application of RetinaNet to the German Traffic Sign Detection Benchmark (GTSDB) dataset.
The Dataset can be downloaded from http://benchmark.ini.rub.de/?section=gtsdb&subsection=news
Split the ground truth text file gt.txt into train, val and test files and define them in CSV format; for more information, follow the guide in keras-retinanet.
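As a sketch, the split can be scripted. `split_gt` below is a hypothetical helper (not part of this repo) that assumes gt.txt uses GTSDB's semicolon-separated layout (`image.ppm;x1;y1;x2;y2;class_id`) and writes keras-retinanet's documented CSV format (`path,x1,y1,x2,y2,class_name`):

```python
import csv
import random

def split_gt(gt_path, out_dir, class_names, train_frac=0.7, val_frac=0.15, seed=42):
    """Split a GTSDB-style gt.txt (image.ppm;x1;y1;x2;y2;class_id) into
    train/val/test CSVs in the keras-retinanet CSV format:
    path,x1,y1,x2,y2,class_name. class_names maps class id -> name."""
    # Group annotations by image so all boxes of an image stay in one split.
    by_image = {}
    with open(gt_path) as f:
        for img, x1, y1, x2, y2, cls in csv.reader(f, delimiter=";"):
            by_image.setdefault(img, []).append((x1, y1, x2, y2, class_names[int(cls)]))
    images = sorted(by_image)
    random.Random(seed).shuffle(images)
    n = len(images)
    cut1, cut2 = int(n * train_frac), int(n * (train_frac + val_frac))
    splits = {"train": images[:cut1], "val": images[cut1:cut2], "test": images[cut2:]}
    for name, imgs in splits.items():
        with open(f"{out_dir}/{name}.csv", "w", newline="") as f:
            writer = csv.writer(f)
            for img in imgs:
                for box in by_image[img]:
                    writer.writerow([f"FullIJCNN2013/{img}", *box])
    return {name: len(imgs) for name, imgs in splits.items()}
```

Splitting by image (rather than by annotation line) keeps all boxes of one image in the same split, which avoids leakage between train and test.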
RetinaNet, as described in the paper Focal Loss for Dense Object Detection, uses a focal loss that focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.
Clone the repository from https://github.com/fizyr/keras-retinanet and follow the instructions there to install and set up Keras RetinaNet. Once keras-retinanet is installed and your data is defined in CSV files, you are ready to train.
There is a debug.py tool to help find the most common mistakes. It visualizes the annotations and at the same time checks the data for compatibility.
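For intuition, here is a minimal sketch of the kind of consistency checks debug.py performs (degenerate boxes, unknown class names). `check_annotations` is a hypothetical helper, not part of keras-retinanet:

```python
import csv

def check_annotations(csv_path, valid_classes):
    """Return human-readable problems found in a keras-retinanet style
    annotations CSV (path,x1,y1,x2,y2,class_name) -- a rough stand-in
    for the checks debug.py runs before training."""
    problems = []
    with open(csv_path) as f:
        for i, (path, x1, y1, x2, y2, cls) in enumerate(csv.reader(f)):
            x1, y1, x2, y2 = map(int, (x1, y1, x2, y2))
            if x2 <= x1 or y2 <= y1:
                problems.append(f"line {i}: degenerate box {(x1, y1, x2, y2)} in {path}")
            if cls not in valid_classes:
                problems.append(f"line {i}: unknown class '{cls}'")
    return problems
```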
To train using CSV files, run from the repository:

```shell
# Run directly from the root directory of the cloned repository:
keras_retinanet/bin/train.py csv {PATH TO ANNOTATIONS FILE} {PATH TO CLASSES FILE}
```
Download and extract the dataset; after extracting you will have the following files:

```shell
# images in ppm format
./FullIJCNN2013/00**.ppm
# ground truth file
./FullIJCNN2013/gt.txt
# class ID mapping
./FullIJCNN2013/classes.txt
```
Place the dataset in the bin folder:

```shell
./keras_retinanet/bin/FullIJCNN2013
```
The training and evaluation of this project are based on the German Traffic Sign Detection Benchmark dataset. It includes images of traffic signs belonging to 43 classes; the class distribution is shown below.
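The distribution can be recomputed directly from gt.txt. A small sketch, assuming the semicolon-separated gt.txt layout with the class id in the last field:

```python
from collections import Counter
import csv

def class_distribution(gt_path):
    """Count annotations per class id in a GTSDB-style gt.txt
    (semicolon-separated, class id in the last field)."""
    counts = Counter()
    with open(gt_path) as f:
        for row in csv.reader(f, delimiter=";"):
            counts[int(row[-1])] += 1
    return counts
```

Plotting `class_distribution(...)` as a bar chart makes the imbalance between the 43 classes immediately visible.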
Let's have a look at a few images of the dataset; follow the code in the notebook for visualization and for splitting the data. Make sure that the paths specified are correct, as the notebook uses relative paths to load the files.
Ground Truth annotations
The default backbone of RetinaNet, ResNet-50 with weights pretrained on the MS COCO dataset, was used for transfer learning. The backbone layers were frozen and only the top layers were trained. The pretrained MS COCO model can be downloaded here.
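Backbone freezing itself is handled by keras-retinanet (its train.py exposes a `--freeze-backbone` flag), but the idea can be sketched in plain tf.keras. The snippet below is an illustrative stand-in, not the project's code, and uses `weights=None` to avoid a download:

```python
import tensorflow as tf

def build_frozen_transfer_model(num_classes=43):
    """Illustration of transfer learning with a frozen backbone:
    only the small classification head on top remains trainable."""
    backbone = tf.keras.applications.ResNet50(
        include_top=False, weights=None, input_shape=(224, 224, 3))
    backbone.trainable = False  # freeze every backbone layer
    inputs = tf.keras.Input(shape=(224, 224, 3))
    x = backbone(inputs, training=False)  # keep BatchNorm in inference mode
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs), backbone
```

Freezing drastically reduces the number of trainable parameters, which helps on a dataset as small as GTSDB.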
After cloning the repo from https://github.com/fizyr/keras-retinanet and installing keras-retinanet, run the following from the terminal:

```shell
# Training keras-retinanet
python keras_retinanet/bin/train.py --weights {PRETRAINED_MODEL} \
    --config ../config.ini --compute-val-loss --weighted-average \
    --multiprocessing --steps 500 --epochs 50 \
    csv ./data/train.csv ./data/classes.csv --val-annotations ./data/val.csv
```
*(Figure: example input images alongside the corresponding RetinaNet outputs.)*
Though the model was able to detect traffic signs in adverse conditions, there were also some misclassifications due to class imbalance, as seen in the class distribution graph.
*(Figure: misclassified examples, shown as input, RetinaNet output and correct label; the correct labels are uneven road, speed-limit 60 and pedestrian crossing.)*
A commonly used metric for object detection is mAP, computed according to the PASCAL VOC development kit, which can be found here.
Results using the cocoapi are reported in the paper Focal Loss for Dense Object Detection (according to the paper, this model configuration achieved a mAP of 0.357 on the COCO dataset).
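For reference, PASCAL VOC 2007-style AP is an 11-point interpolation of the precision/recall curve; mAP is the mean of the per-class APs. A self-contained sketch (not the evaluation code used by this project):

```python
import numpy as np

def voc_ap_11point(recall, precision):
    """11-point interpolated average precision (PASCAL VOC 2007 style):
    average the best precision achievable at recall >= t for
    t in {0.0, 0.1, ..., 1.0}."""
    recall = np.asarray(recall, dtype=float)
    precision = np.asarray(precision, dtype=float)
    ap = 0.0
    for t in np.linspace(0.0, 1.0, 11):
        mask = recall >= t
        p = precision[mask].max() if mask.any() else 0.0
        ap += p / 11.0
    return ap
```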
Similarly, the mAP value was computed while training the model on the GTSDB dataset. The mAP graph logged in TensorBoard shows that the model achieved a maximum value of 0.30, which is low; the main reasons are the small dataset size and class imbalance. Several classes have an AP of 0, dragging the mAP down.
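Augmenting the rare classes is one way to attack this imbalance (keras-retinanet also ships a `--random-transform` option for built-in augmentation). Below is a hedged numpy sketch of label-preserving augmentations, brightness jitter plus a small shift; the `augment` helper is illustrative, not project code:

```python
import numpy as np

def augment(image, rng, max_shift=4, brightness=0.2):
    """Brightness jitter plus a small circular shift; box coordinates
    would shift by the same (dx, dy). Horizontal flips are deliberately
    avoided: mirroring can change a traffic sign's meaning
    (e.g. 'keep right' vs 'keep left')."""
    out = image.astype(np.float32) * (1.0 + rng.uniform(-brightness, brightness))
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    out = np.roll(out, shift=(dy, dx), axis=(0, 1))
    return np.clip(out, 0, 255).astype(np.uint8), (dx, dy)
```

Applying such transforms more often to under-represented classes raises their effective sample count without collecting new images.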
Data Augmentation to tackle the class imbalance problem
Traffic-Sign Detection and Classification in the Wild
sridhar912 Traffic Sign Detection and Classification
The German traffic sign detection-specific code is distributed under the MIT License.