SNIPER / AutoFocus is an efficient multi-scale object detection training / inference algorithm

mahyarnajibi/SNIPER

R-FCN-3000 at 30fps: Decoupling Detection and Classification

R-FCN-3k is a real-time detector for up to 3,130 classes. The idea is to decouple object detection into objectness detection and fine-grained classification, which speeds up inference and training with only a marginal drop in mAP. It is trained on ImageNet classification data with bounding boxes and obtains 34.9% mAP on the ImageNet Detection dataset (37.8% with SNIPER).

With generalized objectness detection, we demonstrate that it is possible to learn a universal objectness detector. Using the universal objectness detector of R-FCN-3k, a detector for a new class can be obtained in seconds by learning only the classification layer.
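The decoupling described above can be illustrated with a minimal NumPy sketch. It assumes that the final per-class score of a box is the product of a class-agnostic objectness score and a softmax over fine-grained class logits. The function names (`softmax`, `decoupled_scores`) and the toy numbers are illustrative only and are not taken from this repository's code.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def decoupled_scores(objectness, class_logits):
    """Combine objectness and fine-grained classification (assumed: a product).

    objectness:   shape (num_boxes,), values in [0, 1]
    class_logits: shape (num_boxes, num_classes)
    returns:      shape (num_boxes, num_classes)
    """
    class_probs = softmax(class_logits, axis=1)
    return objectness[:, None] * class_probs


# Toy example: two boxes, three fine-grained classes.
objectness = np.array([0.9, 0.2])
class_logits = np.array([[2.0, 0.0, 0.0],
                         [0.0, 0.0, 0.0]])
scores = decoupled_scores(objectness, class_logits)
print(scores)
```

Under this assumption, each row of `scores` sums to the objectness of its box, so a box with low objectness cannot receive a high score for any class.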

R-FCN-3k is described in the following paper:

R-FCN-3000 at 30fps: Decoupling Detection and Classification
Bharat Singh*, Hengduo Li*, Abhishek Sharma and Larry S. Davis (* denotes equal contribution)
CVPR 2018.

Demo for Detection on New Classes

With the trained universal objectness detector, you can obtain a new detector in seconds simply by training a lightweight linear classifier!

  1. Download the trained R-FCN-3k model [GoogleDrive][BaiduYun] and put it in
SNIPER/output/chips_resnet101_3k/res101_mx_3k/fall11_whole/

  2. Download the images [GoogleDrive][BaiduYun] for the new classes and put them in demo/.

  3. Run python demo.py to extract features and train the classifier on the new classes. Visualizations of the detection results on the evaluation images are saved in vis_result.

  4. You can use your own data to train the classifier and obtain a detector. Simply put image folders under demo/image/ (like demo/images/cat/xxx.jpg) and run python demo.py. You may need to change the train/eval split strategy and the hyper-parameters based on your own data and purpose.
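The classifier step can be sketched as follows. This is not the code in demo.py: it is a minimal NumPy softmax classifier trained by gradient descent on fixed feature vectors, with a random train/eval split. The 2-D synthetic features stand in for features extracted by the detector, and the names `train_linear_classifier` and `predict`, the learning rate, the epoch count and the 75/25 split are all assumptions made for illustration.

```python
import numpy as np


def train_linear_classifier(features, labels, num_classes, lr=0.5, epochs=200):
    """Fit a softmax linear classifier on fixed features with gradient descent."""
    num_samples, dim = features.shape
    weights = np.zeros((dim, num_classes))
    bias = np.zeros(num_classes)
    one_hot = np.eye(num_classes)[labels]
    for _ in range(epochs):
        logits = features @ weights + bias
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - one_hot) / num_samples  # gradient of mean cross-entropy
        weights -= lr * (features.T @ grad)
        bias -= lr * grad.sum(axis=0)
    return weights, bias


def predict(features, weights, bias):
    """Return the index of the highest-scoring class for each feature vector."""
    return (features @ weights + bias).argmax(axis=1)


# Synthetic stand-in for extracted features: 3 classes, 40 samples each.
rng = np.random.default_rng(0)
centers = np.array([[2.0, 0.0], [-2.0, 0.0], [0.0, 2.0]])
labels = np.repeat(np.arange(3), 40)
features = centers[labels] + 0.3 * rng.standard_normal((120, 2))

# Random 75/25 train/eval split (one possible split strategy).
order = rng.permutation(len(labels))
train_idx, eval_idx = order[:90], order[90:]

weights, bias = train_linear_classifier(features[train_idx], labels[train_idx], 3)
accuracy = (predict(features[eval_idx], weights, bias) == labels[eval_idx]).mean()
print(f"eval accuracy: {accuracy:.2f}")
```

Because only a single linear layer is trained while the features stay fixed, training cost scales with the number of samples and classes rather than with the size of the detector, which is why this step can be fast.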

Demo for R-FCN-3k

With the trained R-FCN-3k detector, you can detect up to 3,000 classes.

  1. Follow the previous instructions to get the trained R-FCN-3k model.

  2. Download the modified ILSVRC2014_devkit [GoogleDrive][BaiduYun] and put it in data.

  3. Run python demo_3k.py to detect with R-FCN-3k. Source images should be put in demo/image_3k/. Visualizations are saved in vis_result.

Training

  1. Please download ImageNet Full Fall 2011 Release and ILSVRC2013_DET validation images, together with the bounding boxes.

  2. Download the modified ILSVRC2014_devkit [GoogleDrive][BaiduYun], which contains essential files for training and evaluation. Arrange the files so that the directory layout looks like this:

    data
    |__ imagenet
        |__ fall11_whole
            |__ n04233124
                |__ xxx.JPEG
                    ...
                ...
        |__ fall11_whole_bbox
            |__ n04233124
                |__ xxx.xml
                    ...
                ...
        |__ ILSVRC2013_DET_val
        |__ ILSVRC2013_DET_val_bbox
        |__ ILSVRC2014_devkit
            |__ data
                |__ 3kcls_1C_words.txt
                |__ 3kcls_cluster_interval1.txt
                |__ 3kcls_index.txt
                |__ wnid_name_dict.txt
                |__ 3kcls_cluster_result1.txt
                |__ meta_det.mat
                |__ det_lists
                    |__ val.txt
                        ...
            |__ evaluation
                |__ eval_det_3k_1C.m
                    ...
                |__ 3k_1C_pred
                    |__ 3k_1C_matching.txt
                |__ cache
  3. Run the following script to download and extract the pre-trained models into the default path (data/pretrained_model):

    bash download_imgnet_models.sh

  4. To train R-FCN-3k, use the following command:

    python main_train.py
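Before starting a long training run, it can help to confirm that the data directory matches the layout shown above. The script below is a hypothetical helper, not part of this repository: `EXPECTED_PATHS` is copied from the directory tree in step 2 (with the devkit under data/imagenet, as drawn there), and `missing_paths` is an illustrative name.

```python
from pathlib import Path

# Paths relative to the data root, taken from the directory tree above.
EXPECTED_PATHS = [
    "imagenet/fall11_whole",
    "imagenet/fall11_whole_bbox",
    "imagenet/ILSVRC2013_DET_val",
    "imagenet/ILSVRC2013_DET_val_bbox",
    "imagenet/ILSVRC2014_devkit/data/3kcls_1C_words.txt",
    "imagenet/ILSVRC2014_devkit/data/3kcls_cluster_interval1.txt",
    "imagenet/ILSVRC2014_devkit/data/3kcls_index.txt",
    "imagenet/ILSVRC2014_devkit/data/wnid_name_dict.txt",
    "imagenet/ILSVRC2014_devkit/data/3kcls_cluster_result1.txt",
    "imagenet/ILSVRC2014_devkit/data/meta_det.mat",
    "imagenet/ILSVRC2014_devkit/data/det_lists/val.txt",
    "imagenet/ILSVRC2014_devkit/evaluation/eval_det_3k_1C.m",
    "imagenet/ILSVRC2014_devkit/evaluation/3k_1C_pred/3k_1C_matching.txt",
]


def missing_paths(data_root):
    """Return the expected relative paths that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths("data")
    if missing:
        print("Missing files or directories:")
        for rel in missing:
            print("  data/" + rel)
    else:
        print("Data layout looks complete.")
```

Run it from the repository root. It only checks that the paths exist and does not validate file contents.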

Evaluation

  1. Download the trained R-FCN-3k model [GoogleDrive][BaiduYun] and put it in
SNIPER/output/chips_resnet101_3k/res101_mx_3k/fall11_whole/

  2. To evaluate the trained model, use the following command:

    python main_test.py

Citing

@article{singh2017r,
  title={R-FCN-3000 at 30fps: Decoupling Detection and Classification},
  author={Singh, Bharat and Li, Hengduo and Sharma, Abhishek and Davis, Larry S},
  journal={CVPR},
  year={2018}
}
