mxnet-finetuner

An all-in-one Deep Learning toolkit for image classification to fine-tuning pretrained models using MXNet.

Prerequisites

docker
docker-compose
jq
wget or curl

When using NVIDIA GPUs

nvidia-docker

Setup

$ git clone https://github.com/knjcode/mxnet-finetuner
$ cd mxnet-finetuner
$ bash setup.sh

Example usage

1. Arrange images into their respective directories

A training data directory (images/train) and validation data directory (images/valid) should containing one subdirectory per image class.

For example, arrange training data and validation data as follows.

images/
    train/
        airplanes/
            airplane001.jpg
            airplane002.jpg
            ...
        watch/
            watch001.jpg
            watch002.jpg
            ...
    valid/
        airplanes/
            airplane101.jpg
            airplane102.jpg
            ...
        watch/
            watch101.jpg
            watch102.jpg
            ...

2. Edit config.yml

Edit config.yml as you like.

For example

common:
  num_threads: 16
  gpus: 0,1,2,3

data:
  quality: 95
  shuffle: 1
  center_crop: 0

finetune:
  pretrained_models:
    - imagenet1k-resnet-50
  optimizers:
    - sgd
  num_epochs: 30
  lr: 0.0001
  lr_factor: 0.1
  lr_step_epochs: 30
  mom: 0.9
  wd: 0.00001
  batch_size: 10

3. Do Fine-tuning

$ docker-compose run finetuner

mxnet-finetuner will automatically execute the followings according to config.yml.

Create RecordIO data from images
Download pretrained models
Replace the last fully-connected layer with a new one that outputs the desired number of classes
Data augumentaion
Do Fine-tuning
Make training accuracy graph
Make confusion matrix
Upload training accuracy graph and confusion matrix to Slack

Training accuracy graph and/or confusion matrix are save at logs/ directory.
Trained models are save at model/ directory.

Trained models are saved with the following file name for each epoch.

model/201705292200-imagenet1k-nin-sgd-0000.params

If you want to upload results to Slack, set SLACK_API_TOKEN environment variable and edit config.yml as below.

finetune:
  train_accuracy_graph_slack_upload: 1
test:
  confusion_matrix_slack_upload: 1

4. Predict with trained models

Select the trained model and epoch you want to use for testing and edit config.yml

If you want to use model/201705292200-imagenet1k-nin-sgd-0001.params, edit config.yml as blow.

test:
  model_prefix: 201705292200-imagenet1k-nin-sgd
  model_epoch: 1

When you want to use the latest highest validation accuracy trained model, edit config.yml as below.

test:
  use_latest: 1

If set this option, model_prefix and model_epoch are ignored.

When you are done, you can predict with the following command

$ docker-compose run finetuner test

Predict result and classification report and/or confusion matrix are save at logs/ directory.

Available pretrained models

Classification accuracy of available pretrained models. (Datasets are ImageNet1K, ImageNet11K and Place365 Challenge)

Please note that the following results are calculated with different datasets.

ResNet accuracy from Reproduce ResNet-v2 using MXNet
Other accuracy from MXNet model gallery and MXNet - Image Classification - Pre-trained Models

model	Top-1 Accuracy	Top-5 Accuracy	download size	model size (MXNet)	dataset	image shape
CaffeNet	54.5%	78.3%	233MB	9.3MB	ImageNet1K	227x227
SqueezeNet	55.4%	78.8%	4.8MB	4.8MB	ImageNet1K	227x227
NIN	58.8%	81.3%	30MB	30MB	ImageNet1K	224x224
ResNet-18	69.5%	89.1%	45MB	43MB	ImageNet1K	224x224
VGG16	71.0%	89.8%	528MB	58MB	ImageNet1K	224x224
VGG19	71.0%	89.8%	549MB	78MB	ImageNet1K	224x224
Inception-BN	72.5%	90.8%	44MB	40MB	ImageNet1K	224x224
ResNet-34	72.8%	91.1%	84MB	82MB	ImageNet1K	224x224
ResNet-50	75.6%	92.8%	98MB	90MB	ImageNet1K	224x224
ResNet-101	77.3%	93.4%	171MB	163MB	ImageNet1K	224x224
ResNet-152	77.8%	93.6%	231MB	223MB	ImageNet1K	224x224
ResNet-200	77.9%	93.8	248MB	240MB	ImageNet1K	224x224
Inception-v3	76.9%	93.3%	92MB	84MB	ImageNet1K	299x299
ResNeXt-50	76.9%	93.3%	96MB	89MB	ImageNet1K	224x224
ResNeXt-101	78.3%	94.1%	170MB	162MB	ImageNet1K	224x224
ResNeXt-101-64x4d	79.1%	94.3%	320MB	312MB	ImageNet1K	224x224
ResNet-152 (imagenet11k)	41.6%	-	311MB	223MB	ImageNet11K	224x224
ResNet-50 (Place365 Challenge)	31.1%	-	181MB	90MB	Place365ch	224x224
ResNet-152 (Place365 Challenge)	33.6%	-	313MB	223MB	Place365ch	224x224

The download size is the file size when first downloading pretrained model.
The model size is the file size to be saved after fine-tuning.

To use these pretrained models, specify the following pretrained_models name in config.yml.

model	pretrained_models names
CaffeNet	imagenet1k-caffenet
SqueezeNet	imagenet1k-squeezenet
NIN	imagenet1k-nin
VGG16	imagenet1k-vgg16
VGG19	imagenet1k-vgg19
Inception-BN	imagenet1k-inception-bn
ResNet-18	imagenet1k-resnet-18
ResNet-34	imagenet1k-resnet-34
ResNet-50	imagenet1k-resnet-50
ResNet-101	imagenet1k-resnet-101
ResNet-152	imagenet1k-resnet-152
ResNet-152 (imagenet11k)	imagenet11k-resnet-152
ResNet-200	imagenet1k-resnet-200
Inception-v3	imagenet1k-inception-v3
ResNeXt-50	imagenet1k-resnext-50
ResNeXt-101	imagenet1k-resnext-101
ResNeXt-101-64x4d	imagenet1k-resnext-101-64x4d
ResNet-50 (Place365 Challenge)	imagenet11k-place365ch-resnet-50
ResNet-152 (Place365 Challenge)	imagenet11k-place365ch-resnet-152

Benchmark

Speed (images/sec)

dataset: 8000 samles
batch size: 10, 20, 30, 40
Optimizer: SGD
GPU: Maxwell TITAN X (12GiB Memory)

model	batch size 10	batch size 20	batch size 30	batch size 40
CaffeNet	755.64	1054.47	1019.24	1077.63
SqueezeNet	458.27	579.37	534.68	549.55
NIN	443.88	516.21	612.83	656.14
ResNet-18	257.40	308.30	331.57	339.09
ResNet-34	149.88	182.69	201.75	207.49
Inception-BN	147.60	183.82	193.74	203.86
ResNet-50	88.55	102.44	109.98	111.04
Inception-v3	67.11	75.90	80.67	82.34
VGG16	56.38	58.01	59.80	59.35
ResNet-101	53.42	63.28	68.14	68.35
VGG19	45.02	46.88	48.62	48.28
ResNet-152	37.88	44.89	48.48	48.62
ResNet-200	22.58	25.61	27.17	27.32
ResNeXt-50	53.30	64.40	71.07	72.96
ResNeXt-101	31.76	39.56	42.90	43.99
ResNeXt-101-64x4d	18.22	23.08	out of memory	out of memory

Memory usage (MiB)

dataset: 8000 samples
batch size: 10, 20, 30, 40
Optimizer: SGD
GPU: Maxwell TITAN X (12GiB GPU Memory)

model	batch size 10	batch size 20	batch size 30	batch size 40	Reference accuracy (imagenet1k Top-5)
CaffeNet	430	496	631	716	78.3%
SqueezeNet	608	937	1331	1672	78.8%
NIN	650	902	1062	1222	81.3%
ResNet-18	814	1163	1497	1853	88.7%
ResNet-34	1127	1619	2094	2598	91.0%
Inception-BN	1007	1569	2212	2772	90.8%
ResNet-50	1875	3080	4265	5483	92.6%
Inception-v3	2075	3509	4944	6383	93.3%
VGG16	1738	2960	4751	5977	89.8%
ResNet-101	2791	4576	6341	8158	93.3%
VGG19	1920	3242	5133	6458	89.8%
ResNet-152	3790	6296	8777	11330	93.1%
ResNet-200	2051	2769	3471	4201	unknown
ResNeXt-50	2248	3863	5468	7089	93.3%
ResNeXt-101	3350	5749	8126	10539	94.1%
ResNeXt-101-64x4d	5140	8679	out of memory	out of memory	94.3%

Available optimizers

SGD
DCASGD
NAG
SGLD
Adam
AdaGrad
RMSProp
AdaDelta
Ftrl
Adamax
Nadam

To use these optimizers, specify the optimizer name in lowercase in config.yml.

Utilities

util/counter.sh

Count the number of files in each subdirectory.

$ util/counter.sh testdir
testdir contains 4 directories
Leopards    197
Motorbikes  198
airplanes   199
watch       200

util/move_images.sh

Move the specified number of jpeg images from the target directory to the output directory while maintaining the directory structure.

$ util/move_images.sh 20 testdir newdir
processing Leopards
processing Motorbikes
processing airplanes
processing watch
$ util/counter.sh newdir
newdir contains 4 directories
Leopards    20
Motorbikes  20
airplanes   20
watch       20

Prepare sample images for fine-tuning

Download Caltech 101 dataset, and split part of it into the example_images directory.

$ util/caltech101_prepare.sh

example_images/train is train set of 60 images for each classes
example_images/valid is validation set of 20 images for each classes
example_imags/test is test set of 20 images for each classes

$ util/counter.sh example_images/train
example_images/train contains 10 directories
Faces       60
Leopards    60
Motorbikes  60
airplanes   60
bonsai      60
car_side    60
chandelier  60
hawksbill   60
ketch       60
watch       60

With this data you can immediately try fine-tuning.

$ util/caltech101_prepare.sh
$ rm -rf images
$ mv exmaple_images images
$ docker-compose run finetuner

Misc

Use DenseNet

ImageNet pretrained DenseNet-169 model is introduced on A MXNet implementation of DenseNet with BC structure.

You can use this model as below

Download parameter and symbol files

densenet-imagenet-169-0-0125.params https://drive.google.com/open?id=0B_M7XF_l0CzXX3V3WXJoUnNKZFE

densenet-imagenet-169-0-symbol.json https://raw.githubusercontent.com/bruinxiong/densenet.mxnet/master/densenet-imagenet-169-0-symbol.json

Rearrange downloaded files

Change the name of the downloaded files and store it as below

model/imagenet1k-densenet-169-0000.params
model/imagenet1k-densenet-169-symbol.json

Modify config

To use DenseNet-169 pretrained models, specify the imagenet1k-densenet-169 in config.yml.

Try image classification with jupyter notebook

$ docker-compose run --service-ports finetuner jupyter

Please note that the --service-port option is required

Replace the IP address of the displayed URL with the IP address of the host machine and access it from the browser.

Open the classify_example/classify_example.ipynb and try out the image classification sample using the VGG-16 pretrained model pretrained with ImageNet.

config.yml options

common settings

common:
  num_threads: 4
  # gpus: 0  # list of gpus to run, e.g. 0 or 0,2,5.

data settings

train, validation and test RecordIO data generation settings.

mxnet-finetuner pack the image files into a recordIO file for increased performance.

data:
  quality: 95
  shuffle: 1
  center_crop: 0
  # use_japanese_label: 1

finetune settings

finetune:
  pretrained_models:  # the pretrained models
    - imagenet1k-nin
    # - imagenet1k-inception-v3
    # - imagenet1k-vgg16
    # - imagenet1k-resnet-50
    # - imagenet11k-resnet-152
    # - imagenet1k-resnext-101
    # etc
  optimizers:  # the optimizer types
    - sgd
    # optimizers: sgd, dcasgd, nag, sgld, adam, adagrad, rmsprop, adadelta, ftrl, adamax, nadam
  num_epochs: 10  # max num of epochs
  # load_epoch: 0  # specify when using user fine-tuned model
  lr: 0.0001  # initial learning rate
  lr_factor: 0.1  # the ratio to reduce lr on each step
  lr_step_epochs: 10  # the epochs to reduce the lr, e.g. 30,60
  mom: 0.9  # momentum for sgd
  wd: 0.00001  # weight decay for sgd
  batch_size: 10  # the batch size
  disp_batches: 10  # show progress for every n batches
  # top_k: 0  # report the top-k accuracy. 0 means no report.
  # random_crop: 0  # if or not randomly crop the image
  # random_mirror: 0  # if or not randomly flip horizontally
  # max_random_h: 0  # max change of hue, whose range is [0, 180]
  # max_random_s: 0  # max change of saturation, whose range is [0, 255]
  # max_random_l: 0  # max change of intensity, whose range is [0, 255]
  # max_random_aspect_ratio: 0  # max change of aspect ratio, whose range is [0, 1]
  # max_random_rotate_angle: 0  # max angle to rotate, whose range is [0, 360]
  # max_random_shear_ratio: 0  # max ratio to shear, whose range is [0, 1]
  # max_random_scale: 1  # max ratio to scale
  # min_random_scale: 1  # min ratio to scale, should >= img_size/input_shape. otherwise use --pad-size
  # rgb_mean: '123.68,116.779,103.939'  # a tuple of size 3 for the mean rgb
  # monitor: 0  # log network parameters every N iters if larger than 0
  # pad_size: 0  # padding the input image
  auto_test: 1  # if or not test with validation data after fine-tuneing is completed
  train_accuracy_graph_output: 1
  # train_accuracy_graph_fontsize: 12
  # train_accuracy_graph_figsize: 8,6
  # train_accuracy_graph_slack_upload: 1
  # train_accuracy_graph_slack_channels:
  #   - general

test settings

test:
  use_latest: 1  # Use last trained model. If set this option, model_prefix and model_epoch are ignored
  model_prefix: 201705292200-imagenet1k-nin-sgd
  model_epoch: 1
  # model_epoch_up_to: 10  # test from model_epoch to model_epoch_up_to respectively
  test_batch_size: 10
  # top_k: 10
  classification_report_output: 1
  # classification_report_digits: 3
  confusion_matrix_output: 1
  # confusion_matrix_fontsize: 12
  # confusion_matrix_figsize: 16,12
  # confusion_matrix_slack_upload: 1
  # confusion_matrix_slack_channels:
  #   - general

Acknowledgement

Licnese

Apache-2.0 license.

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
classify_example		classify_example
common		common
data		data
images		images
logs		logs
model		model
util		util
.dockerignore		.dockerignore
.gitignore		.gitignore
Dockerfile		Dockerfile
Makefile		Makefile
README.md		README.md
docker-entrypoint.sh		docker-entrypoint.sh
setup.sh		setup.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

mxnet-finetuner

Prerequisites

Setup

Example usage

1. Arrange images into their respective directories

2. Edit config.yml

3. Do Fine-tuning

4. Predict with trained models

Available pretrained models

Benchmark

Speed (images/sec)

Memory usage (MiB)

Available optimizers

Utilities

util/counter.sh

util/move_images.sh

Prepare sample images for fine-tuning

Misc

Use DenseNet

Download parameter and symbol files

Rearrange downloaded files

Modify config

Try image classification with jupyter notebook

config.yml options

common settings

data settings

finetune settings

test settings

Acknowledgement

Licnese

About

Releases

Packages

Languages

thomasdic2000/mxnet-finetuner

Folders and files

Latest commit

History

Repository files navigation

mxnet-finetuner

Prerequisites

Setup

Example usage

1. Arrange images into their respective directories

2. Edit config.yml

3. Do Fine-tuning

4. Predict with trained models

Available pretrained models

Benchmark

Speed (images/sec)

Memory usage (MiB)

Available optimizers

Utilities

util/counter.sh

util/move_images.sh

Prepare sample images for fine-tuning

Misc

Use DenseNet

Download parameter and symbol files

Rearrange downloaded files

Modify config

Try image classification with jupyter notebook

config.yml options

common settings

data settings

finetune settings

test settings

Acknowledgement

Licnese

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages