By Cheng-Yang Fu*, Wei Liu*, Ananth Ranga, Ambrish Tyagi, Alexander C. Berg.
*=Equal Contribution
The first version is done. Users can start training the SSD/DSSD with Resnet-101 now.
Stay tuned. Models, trained for Pascal VOC 2007, 2012 and COCO , will be released soon.
Deconvolutional SSD brings additional context into state-of-the-art general object detection by adding extra deconvolution structures. The DSSD achieves much better accuracy on small objects compared to SSD.
The code is based on SSD. For more details, please refer to our arXiv paper.
Please cite DSSD in your publications if it helps your research:
@inproceedings{Fu2016dssd,
title = {{DSSD}: Deconvolutional Single Shot Detector},
author = {Fu, Cheng-Yang and Liu, Wei and Ranga, Ananth and Tyagi, Ambrish and Berg, Alexander C.},
booktitle = {arXiv preprint arXiv:1701.06659},
}
-
Download the code from github. We call this directory as
$CAFFE_ROOT
later.git clone https://github.com/chengyangfu/caffe.git cd $CAFFE_ROOT git checkout dssd
-
Build the code. Please follow Caffe instruction to install all necessary packages and build it.
# Modify Makefile.config according to your Caffe installation. cp Makefile.config.example Makefile.config make -j8 # Make sure to include $CAFFE_ROOT/python to your PYTHONPATH. make py make test -j8 # (Optional) make runtest -j
-
Please Follow the Orginal SSD to do all the preparation works. You should have lmdb fils for VOC2007. Check the following two links exist or not.
ls $CAFFE_ROOT/examples # $CAFFE_ROOT/examples/VOC0712/VOC0712_trainval_lmdb # $CAFFE_ROOT/examples/VOC0712/VOC0712_test_lmdb
-
Download the Resnet-101 models from the Deep-Residual-Network.
# creat the directory for ResNet-101 cd $CAFFE_ROOT/models mkdir ResNet-101 # Rename the Resnet-101 models and put in the ResNet-101 direcotry ls $CAFFE_ROOT/models/ResNet-101 # $CAFFE_ROOT/models/ResNet-101/ResNet-101-model.caffemodel # $CAFFE_ROOT/models/ResNet-101/ResNet-101-deploy.prototxt
-
Train and Eval the SSD model
# Train the SSD-ResNet-101 321x321 python examples/ssd/ssd_pascal_resnet_321.py # GPU setting may need be change according to the numbers of gpu # models are generated in: # $CAFFE_ROOT/models/ResNet-101/VOC0712/SSD_VOC07_321x321 # Evaluate the model cd $CAFFE_ROOT ./build/tools/caffe train --solver="./models/ResNet-101/VOC0712/SSD_VOC07_321x321/test_solver.prototxt" \ --weights="./models/ResNet-101/VOC0712/SSD_VOC07_321x321/ResNet-101_VOC0712_SSD_VOC07_321x321_iter_80000.caffemodel" \ --gpu=0 # batch size in the test.prototxt may need be changed. # If the batch size is changed, remeber to change the test_iter in test_solver.prototxt at same time. # It should reach 77.5* mAP at 80k iterations.
-
Train and Evaluate the DSSD model. In this script, Resnet-101 and SSD related layers are frozen and only the DSSD related layers are trained.
# Use the SSD-ResNet-101 321x321 as the pretrained model python examples/ssd/ssd_pascal_resnet_deconv_321.py # Evaluate the model cd $CAFFE_ROOT ./build/tools/caffe train --solver="./models/ResNet-101/VOC0712/DSSD_VOC07_321x321/test_solver.prototxt" \ --weights="./models/ResNet-101/VOC0712/DSSD_VOC07_321x321/ResNet-101_VOC0712_DSSD_VOC07_321x321_iter_30000.caffemodel" \ --gpu=0 # It should reach 78.6* mAP at 30k iterations.
-
Train and Evalthe DSSD model. In this script, we try to fine-tune the entire network. In order to sucessfully finetune the network, we need to freeze all the batch norm related layers in Caffe.
# Use the DSSD-ResNet-101 321x321 as the pretrained model python examples/ssd/ssd_pascal_resnet_deconv_ft_321.py # Evaluate the model cd $CAFFE_ROOT ./build/tools/caffe train --solver="./models/ResNet-101/VOC0712/DSSD_VOC07_FT_321x321/test_solver.prototxt" \ --weights="./models/ResNet-101/VOC0712/DSSD_VOC07_FT_321x321/ResNet-101_VOC0712_DSSD_VOC07_FT_321x321_iter_40000.caffemodel" \ --gpu=0 # Finetuning the entire network only works for the model with 513x513 inputs not 321x321.
-
We add two scripts for training SSD/DSSD with 513x513 inputs on COCO.
# Train SSD513-ResNet101 on COCO python examples/ssd/ssd_coco_resnet_513.py # Train DSSD513-ResNet101 on COCO and use SSD513 as the pretrained model python examples/ssd/ssd_coco_resnet_deconv_513.py
-
We strongly suggest to use the trained models instead of training from scracth.
# move the compressed files at $CAFFE_ROOT/models/ResNet-101 cd $CAFFE_ROOT/models/ResNet-101 tar -vzxf SSD_513_COCO.tar.gz tar -vzxf DSSD_513_COCO.tar.gz
P.S.: Please change the field "start" to offset" in PriorBox Layers.
-
In our experiments, the model with 513x513 inputs are trained using NVIDIA P40 which consists of 22GB memory. Because we add extra batch normalization layers, it's important to make the mini-batchs size at least 5 in each gpu. So, if you use the gpu with smaller memory, I don't think you can replicate the results.