Deep Learning in OpenCV

Deep learning is the most popular and fastest-growing area in computer vision today. Since OpenCV 3.1, the library has included a DNN module that implements the forward pass (inference) for deep networks pre-trained with popular deep learning frameworks such as Caffe. In OpenCV 3.3 the module was promoted from the opencv_contrib repository to the main repository (https://github.com/opencv/opencv/tree/master/modules/dnn) and was accelerated significantly.

The module has no extra dependencies other than libprotobuf, which is now bundled with OpenCV.

The supported frameworks:

Caffe
TensorFlow
Torch
Darknet
Models in ONNX format (the main way to import models from PyTorch and, in some cases, Keras)

The supported layers:

AbsVal
Accum
AveragePooling
BatchNormalization
BNLL
Concatenation
Convolution (1d, 2d, including dilated convolution, 3d)
Crop
CropAndResize (RCNN-specific layer)
Deconvolution, a.k.a. transposed convolution or full convolution
DetectionOutput (SSD-specific layer)
Dropout
Einsum
Eltwise (+, *, max)
ELU
Expand
Flatten
FullyConnected
FlowWarp
Gather
Interpolation
LRN
LSTM
MaxPooling
MaxUnpooling
Mish
MVN
NormalizeBBox (SSD-specific layer)
Padding
Permute
Power
PReLU (including ChannelPReLU with channel-specific slopes)
PriorBox (SSD-specific layer)
ReLU
ReduceL1
ReduceL2
ReduceLogSum
ReduceLogSumExp
ReduceMax
ReduceMean
ReduceMin
ReduceProd
ReduceSum
ReduceSumSquare
Region (for DarkNet models)
Reorg
Resize
RNN
ROI Pooling (RCNN-specific layer)
Scale
Shift
ShuffleChannel
Sigmoid
Slice
Softmax
Split
Swish
TanH

You can also write your own custom layer.

The module includes SSE, AVX, AVX2, and NEON acceleration for the performance-critical layers, as well as CUDA support for most layers. There is also a continually improving Halide backend. An OpenCL (libdnn-based) backend is being developed and should be integrated after the OpenCV 3.3 release. Up-to-date benchmarking results are available here: DNN Efficiency

The provided API (for C++ and Python) is very easy to use: just load the network and run it. Multiple inputs and outputs are supported. Examples are available at https://github.com/opencv/opencv/tree/master/samples/dnn.

There is a Habrahabr article describing the module: https://habrahabr.ru/company/intel/blog/333612/ (in Russian).

The following networks have been tested and are known to work: