This is an experimental TensorFlow implementation of Faster R-CNN (TFFRCNN), mainly based on the work of smallcorgi and rbgirshick. I have reorganized the libraries under the lib
path, making each Python module independent of the others, so you can understand and rewrite the code easily.
For details about Faster R-CNN, please refer to the paper "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks" by Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun.
- Resnet networks support
- KITTI object detection dataset support
- Position Sensitive ROI Pooling (psroi_pooling), not tested yet
- Hard Example Mining
- Data Augmentation
- Requirements for TensorFlow (see: TensorFlow)
- Python packages you might not have: cython, python-opencv, easydict (installing Anaconda is recommended)
- For training the end-to-end version of Faster R-CNN with VGG16, 3 GB of GPU memory is sufficient (using cuDNN)
- Clone the Faster R-CNN repository
git clone https://github.com/CharlesShang/TFFRCNN.git
- Build the Cython modules
cd TFFRCNN/lib
make # compile cython and roi_pooling_op; you may need to modify make.sh for your platform
After successfully completing basic installation, you'll be ready to run the demo.
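Before running the demo, it can help to confirm that the Python packages listed in the requirements are importable. The following is a minimal sketch, not part of the repository: the function name `find_missing_packages` is hypothetical, and the mapping from package names to import names (`Cython`, `cv2`) is an assumption about how those packages are usually imported.

```python
import importlib.util

# Map each package named in the requirements to the module it is usually
# imported as (assumption: python-opencv provides `cv2`, cython provides `Cython`).
REQUIRED_MODULES = {
    "cython": "Cython",
    "python-opencv": "cv2",
    "easydict": "easydict",
    "tensorflow": "tensorflow",
}


def find_missing_packages(required=REQUIRED_MODULES):
    """Return the sorted package names whose import module cannot be located."""
    missing = []
    for package, module in required.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None:
            missing.append(package)
    return sorted(missing)


if __name__ == "__main__":
    missing = find_missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

This only checks that the modules can be located; it does not verify versions or that the compiled ops under lib were built.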
To run the demo
cd $TFFRCNN
python ./faster_rcnn/demo.py --model model_path
The demo performs detection using a VGG16 network trained for detection on PASCAL VOC 2007.
- Download the training, validation, and test data and the VOCdevkit
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
- Extract all of these tars into one directory named VOCdevkit
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
- It should have this basic structure
$VOCdevkit/ # development kit
$VOCdevkit/VOCcode/ # VOC utility code
$VOCdevkit/VOC2007 # image sets, annotations, etc.
# ... and several other directories ...
- Create symlinks for the PASCAL VOC dataset
cd $TFFRCNN/data
ln -s $VOCdevkit VOCdevkit2007
- Download the pre-trained VGG16 model and put it at ./data/pretrain_model/VGG_imagenet.npy
- Run training scripts
cd $TFFRCNN
python ./faster_rcnn/train_net.py \
  --gpu 0 \
  --weights ./data/pretrain_model/VGG_imagenet.npy \
  --imdb voc_2007_trainval \
  --iters 70000 \
  --cfg ./experiments/cfgs/faster_rcnn_end2end.yml \
  --network VGGnet_train \
  --set EXP_DIR exp_dir
- Run profiling
cd $TFFRCNN
# install a visualization tool
sudo apt-get install graphviz
./experiments/profiling/run_profiling.sh
# generates the image ./experiments/profiling/profile.png
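Training fails early if the dataset symlink or the pre-trained weights are not where the steps above place them. As a rough pre-flight check, the sketch below looks for the paths named in those steps relative to the TFFRCNN checkout. The function name `missing_training_inputs` is hypothetical and not part of the repository.

```python
import os

# Paths named in the steps above, relative to the TFFRCNN checkout.
EXPECTED_PATHS = (
    os.path.join("data", "VOCdevkit2007", "VOCcode"),
    os.path.join("data", "VOCdevkit2007", "VOC2007"),
    os.path.join("data", "pretrain_model", "VGG_imagenet.npy"),
)


def missing_training_inputs(tffrcnn_root):
    """Return the expected paths that do not exist under tffrcnn_root."""
    # os.path.exists follows symlinks, so the VOCdevkit2007 link is resolved.
    return [
        path
        for path in EXPECTED_PATHS
        if not os.path.exists(os.path.join(tffrcnn_root, path))
    ]
```

For example, `missing_training_inputs(os.environ["TFFRCNN"])` returns an empty list when everything is in place, assuming `$TFFRCNN` is exported as in the commands above.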
- Download the KITTI detection dataset
http://www.cvlibs.net/datasets/kitti/eval_object.php
- Extract all of these archives into ./TFFRCNN/data/ so that the directory structure looks like this:
KITTI
|-- training
    |-- image_2
        |-- [000000-007480].png
    |-- label_2
        |-- [000000-007480].txt
|-- testing
    |-- image_2
        |-- [000000-007517].png
    |-- label_2
        |-- [000000-007517].txt
- Convert KITTI into Pascal VOC format
cd $TFFRCNN
./experiments/scripts/kitti2pascalvoc.py \
  --kitti $TFFRCNN/data/KITTI --out $TFFRCNN/data/KITTIVOC
- The output directory looks like this:
KITTIVOC
|-- Annotations
    |-- [000000-007480].xml
|-- ImageSets
    |-- Main
        |-- [train|val|trainval].txt
|-- JPEGImages
    |-- [000000-007480].jpg
- Training on KITTIVOC is just like on Pascal VOC 2007
python ./faster_rcnn/train_net.py \
  --gpu 0 \
  --weights ./data/pretrain_model/VGG_imagenet.npy \
  --imdb kittivoc_train \
  --iters 160000 \
  --cfg ./experiments/cfgs/faster_rcnn_kitti.yml \
  --network VGGnet_train
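To give a sense of what the KITTI to Pascal VOC conversion involves, here is a minimal sketch that turns one KITTI label line into a VOC-style `<object>` element. It is an illustration only and is not the code in kitti2pascalvoc.py: the function name `kitti_line_to_voc_object` is hypothetical, rounding to whole pixels is an assumption, the sample label line is made up, and how the real script treats classes such as DontCare is not covered here.

```python
import xml.etree.ElementTree as ET


def kitti_line_to_voc_object(line):
    """Convert one KITTI label line into a Pascal VOC <object> element."""
    fields = line.split()
    # KITTI layout: type, truncated, occluded, alpha, then the 2D box as
    # left, top, right, bottom (pixels); the 3D fields that follow are ignored.
    name = fields[0]
    left, top, right, bottom = (float(v) for v in fields[4:8])

    obj = ET.Element("object")
    ET.SubElement(obj, "name").text = name
    box = ET.SubElement(obj, "bndbox")
    # Rounding to whole pixels is an assumption of this sketch.
    for tag, value in (("xmin", left), ("ymin", top), ("xmax", right), ("ymax", bottom)):
        ET.SubElement(box, tag).text = str(int(round(value)))
    return obj


if __name__ == "__main__":
    # Made-up label line, for illustration only.
    sample = "Car 0.00 0 -1.58 587.01 173.33 614.12 200.12 1.65 1.67 3.64 -0.65 1.71 46.70 -1.59"
    print(ET.tostring(kitti_line_to_voc_object(sample), encoding="unicode"))
```

A full annotation file would wrap such elements in an `<annotation>` root together with the image filename and size, which is what the Annotations directory shown above holds.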