The Gesture Recognition dataset was created by Ankitesh Gupta. For details on how the dataset was created, check this cool repository.
The images were normalized using the mean and the standard deviation of the pixel values before being given to the model for training and testing. The code for normalizing the data is in preprocess.py.
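The normalization step can be sketched roughly as follows. This is not the code from preprocess.py; it is a minimal illustration that assumes a single scalar mean and standard deviation computed over all training pixels, and it reuses those training statistics for the test set (a common practice, assumed here). The function name `normalize_images` and the random image data are hypothetical.

```python
import numpy as np

def normalize_images(images, mean=None, std=None):
    """Standardize pixel values as (x - mean) / std.

    If mean/std are not given, they are computed from `images`
    (assumption: one scalar mean and std over all pixels).
    """
    images = np.asarray(images, dtype=np.float64)
    if mean is None:
        mean = images.mean()
    if std is None:
        std = images.std()
    return (images - mean) / std, mean, std

# Hypothetical data: small batches of 8x8 grayscale images.
rng = np.random.default_rng(0)
train = rng.integers(0, 256, size=(4, 8, 8))
test = rng.integers(0, 256, size=(2, 8, 8))

# Compute statistics on the training set, then reuse them for the test set.
train_norm, mean, std = normalize_images(train)
test_norm, _, _ = normalize_images(test, mean, std)
```

After this step the training pixels have zero mean and unit standard deviation, which tends to make gradient-based training better conditioned.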
The model consists of the following layers:

- Zero Padding Layer
  i) Height Padding: 2
  ii) Width Padding: 2
- Convolution Layer
  i) Kernels: 64
  ii) Kernel Height: 3
  iii) Kernel Width: 3
  iv) Stride: 1
- Pooling Layer
  i) Pooling Height: 2
  ii) Pooling Width: 2
  iii) Stride Height: 2
  iv) Stride Width: 2
- ReLU Layer
- Zero Padding Layer
  i) Height Padding: 2
  ii) Width Padding: 2
- Convolution Layer
  i) Kernels: 128
  ii) Kernel Height: 3
  iii) Kernel Width: 3
  iv) Stride: 1
- Pooling Layer
  i) Pooling Height: 2
  ii) Pooling Width: 2
  iii) Stride Height: 2
  iv) Stride Width: 2
- ReLU Layer
- Flatten Layer
- Affine Layer: 128 Neurons
- Affine Layer: 64 Neurons
- Affine Layer: 16 Neurons
- Affine Layer: 5 Neurons (this is the output layer)
- Softmax Layer
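To see how the spatial size changes through the two padding/convolution/pooling blocks, the shapes can be traced with a few lines of arithmetic. The sketch below assumes that a padding of 2 means two pixels added on each side, and it uses a 64x64 input purely as a hypothetical example, since the input size is not stated here. The ReLU layers are omitted because they do not change the shape.

```python
def pad_out(size, pad):
    # Zero padding adds `pad` pixels on both sides (assumption).
    return size + 2 * pad

def window_out(size, window, stride):
    # Output size of a convolution or pooling window without extra padding.
    return (size - window) // stride + 1

def trace_shapes(height, width):
    """Return (height, width, channels, flattened_size) after both blocks."""
    h, w, channels = height, width, None
    for kernels in (64, 128):
        h, w = pad_out(h, 2), pad_out(w, 2)                # zero padding: 2
        h, w = window_out(h, 3, 1), window_out(w, 3, 1)    # 3x3 conv, stride 1
        h, w = window_out(h, 2, 2), window_out(w, 2, 2)    # 2x2 pool, stride 2
        channels = kernels
    return h, w, channels, h * w * channels

# Hypothetical 64x64 input: 64 -> 68 -> 66 -> 33 -> 37 -> 35 -> 17
print(trace_shapes(64, 64))  # (17, 17, 128, 36992)
```

Under these assumptions, the flatten layer would feed 36992 values into the first affine layer of 128 neurons.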
After 100 epochs, the model's performance is:
Training Accuracy: 100%
Validation Accuracy: 100%
Test Accuracy: 99.8%
Training this model for 100 epochs took about 1 to 1.5 hours.
The initial weights of the network were assigned using Xavier initialization.
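Xavier (Glorot) initialization comes in a uniform and a normal variant, and it is not stated which one this network uses. As an illustration only, the uniform variant draws weights from the range [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)). The function name `xavier_uniform` is hypothetical.

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, rng):
    """Xavier/Glorot uniform initialization for a (fan_in, fan_out) weight matrix."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Example: weights between the 128-neuron and 64-neuron affine layers.
rng = np.random.default_rng(0)
W = xavier_uniform(128, 64, rng)
```

The resulting weights have a variance of roughly 2 / (fan_in + fan_out), which keeps activation and gradient magnitudes comparable from layer to layer at the start of training.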
The model was trained using mini-batch gradient descent with the Adam optimizer.
Each mini-batch was sampled at random during training.
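The training procedure described above (random mini-batches plus Adam updates) can be sketched on a toy problem. This is not the repository's training loop: the linear least-squares model, the data, the batch size of 32, and the learning rate are all hypothetical, and sampling each batch without replacement is an assumption. Only the Adam update rule itself follows the standard published formulation.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step count used for bias correction."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias-corrected moments
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy regression data (hypothetical): y = X @ true_w, no noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w

w = np.zeros(3)
m = np.zeros(3)
v = np.zeros(3)
batch_size = 32
for t in range(1, 2001):
    # Sample a mini-batch at random.
    idx = rng.choice(len(X), size=batch_size, replace=False)
    xb, yb = X[idx], y[idx]
    # Gradient of the mean squared error on this mini-batch.
    grad = 2 * xb.T @ (xb @ w - yb) / batch_size
    w, m, v = adam_step(w, grad, m, v, t, lr=0.01)
```

In a real network the same update would be applied to every weight and bias array, each with its own `m` and `v` buffers.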
- Download the model
- Extract it into the models folder
- Load the model using the command below

      from networks.network import network
      model = network.load("model.json")
      prediction = model.predict(X)
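The return format of `model.predict` is not documented here. Assuming it returns one row of softmax probabilities per image, with one column for each of the 5 output classes, the predicted class can be read off with `argmax`. The array below is a hand-written stand-in for that output, not real model output.

```python
import numpy as np

# Stand-in for the result of model.predict(X): 2 images, 5 classes
# (assumed layout: rows are images, columns are class probabilities).
prediction = np.array([
    [0.02, 0.90, 0.03, 0.03, 0.02],
    [0.10, 0.05, 0.05, 0.70, 0.10],
])

predicted_class = np.argmax(prediction, axis=1)  # index of the most likely class
confidence = prediction.max(axis=1)              # its softmax probability
print(predicted_class)  # [1 3]
```

If the output is laid out with classes along the first axis instead, `axis=0` would be the right choice.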