This code accompanies the CVPR 2016 paper "CRAFT Objects from Images".
In brief, we extend the conventional two-stage object detection framework (first locating object proposals, then classifying object categories) to a four-stage pipeline. The proposal localization task is solved by a cascade of a Region Proposal Network (RPN) and Fast R-CNN to improve proposal quality, while the object classification task is handled by a cascade of two Fast R-CNN networks with different objective functions (one-hot classification and one-vs-rest classification) to eliminate false positives.
We name our approach "CRAFT" (short for "Cascade Rpn And FasT-rcnn") and show considerable improvement over the Fast R-CNN and Faster R-CNN baselines on the PASCAL VOC 07/12 and ILSVRC datasets. For more details, please refer to our CVPR 2016 paper.
The code is built on RPN (Stage 1) and Fast R-CNN (Stages 2, 3, and 4). Familiarity with these two projects will make the code easier to use.
The code was tested on Ubuntu 14.04 with 256 GB of memory, a Titan X GPU, and MATLAB R2015a.
- Follow the instructions in Faster R-CNN to build the code in `1_RPN`, using the Caffe provided by Shaoqing Ren.
- Follow the instructions in Fast R-CNN to build the code in `2_CasRPN`, `3_FRCN`, and `4_CasFRCN`, using our slightly modified Caffe.
- Download the VGG16 pre-trained model and the PASCAL VOC 2012 dataset, and create links pointing to them.
- For convenience, you can create soft links to the folders `caffe-fast-rcnn` and `data` for `2_CasRPN`, `3_FRCN`, and `4_CasFRCN`.
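The soft links mentioned in the last item could be created along these lines. This is a minimal sketch, not part of the released code: the function name `link_shared` and both of its arguments (a folder that holds the shared `caffe-fast-rcnn` and `data` directories, and the root of this repository) are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: link the shared folders into the three Fast R-CNN based stages.
# $1 = hypothetical folder containing caffe-fast-rcnn/ and data/
# $2 = hypothetical root of the CRAFT checkout (contains 2_CasRPN, 3_FRCN, 4_CasFRCN)
link_shared() {
  local shared_root="$1" craft_root="$2"
  local stage name
  for stage in 2_CasRPN 3_FRCN 4_CasFRCN; do
    for name in caffe-fast-rcnn data; do
      # -s: symbolic link, -f: replace an existing link,
      # -n: treat an existing link to a directory as a file instead of descending into it
      ln -sfn "$shared_root/$name" "$craft_root/$stage/$name"
    done
  done
}
```

It would be called as, for example, `link_shared /path/to/shared /path/to/CRAFT`. If a stage folder already contains a real `data` or `caffe-fast-rcnn` directory rather than a link, move it aside first, since `ln` would otherwise place the new link inside that directory.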
The pipeline is trained stage by stage. The steps below show how to train an object detector with the CRAFT approach on the PASCAL VOC 2012 train+val set and test it on the PASCAL VOC 2012 test set. For simplicity, we do not use joint training between the RPN and Fast R-CNN networks.
Stage 1. RPN
cd 1_RPN
matlab ./experiments/script_faster_rcnn_VOC2012_VGG16.m
matlab saveProposals.m
Stage 2. CasRPN
cd 2_CasRPN
bash train.sh
bash test.sh
matlab saveProposals.m
Stage 3. FRCN
cd 3_FRCN
bash train.sh
bash test.sh
matlab saveDetections.m
Stage 4. CasFRCN
cd 4_CasFRCN
bash train.sh
bash test.sh
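Because each stage consumes the output of the previous one, the commands above need to run in order, and a failure in one stage should stop the rest. The following wrapper is a rough sketch of that idea, not a script shipped with the code: `run_craft` and the `DRY_RUN` variable are hypothetical names, and the commands are copied verbatim from the stage listings, so the way MATLAB is launched may need adjusting for your installation.

```shell
#!/usr/bin/env bash
# Sketch only: run the four stages in order, stopping at the first failure.
# $1 = hypothetical root of the CRAFT checkout.
# Set DRY_RUN=1 (an assumption of this sketch) to print the commands without running them.
run_craft() {
  local root="$1" dir cmd
  # The list is read on file descriptor 3 so the stage commands keep their own stdin.
  while IFS='|' read -r dir cmd <&3; do
    echo "[$dir] $cmd"
    if [ "${DRY_RUN:-0}" != "1" ]; then
      # $cmd is left unquoted on purpose so it splits into program and arguments.
      ( cd "$root/$dir" && $cmd ) || return 1
    fi
  done 3<<'EOF'
1_RPN|matlab ./experiments/script_faster_rcnn_VOC2012_VGG16.m
1_RPN|matlab saveProposals.m
2_CasRPN|bash train.sh
2_CasRPN|bash test.sh
2_CasRPN|matlab saveProposals.m
3_FRCN|bash train.sh
3_FRCN|bash test.sh
3_FRCN|matlab saveDetections.m
4_CasFRCN|bash train.sh
4_CasFRCN|bash test.sh
EOF
}
```

Running `DRY_RUN=1 run_craft /path/to/CRAFT` first is a cheap way to check the order before committing to a long training run.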
| method | training data | test data | mAP |
| ------------------------- |:--------------------------------------:|:--------------------:|:-----:|
| CRAFT, VGG-16 | VOC 2007 trainval + 2012 trainval | VOC 2007 test | 75.7% |
| CRAFT, VGG-16 | VOC 2012 trainval | VOC 2012 test | 71.3% |
Note: The mAP you obtain with this code may differ slightly from the results above, which are those reported in the paper. This code does not currently use joint training between RPN and Fast R-CNN.
If you use our code in your research, we would be grateful if you cite the paper:
@inproceedings{binyang16craft,
title={Craft Objects from Images},
author={Yang, Bin and Yan, Junjie and Lei, Zhen and Li, Stan},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
year={2016}
}
We give our sincere gratitude to the following people, groups and institutions:
- Anonymous reviewers
- Ross Girshick for the Fast R-CNN project
- Shaoqing Ren for the Faster R-CNN project
- Caffe team
- VGG team
- SenseTime Group Limited
- NVIDIA Corporation