A handcrafted CNN implementation that uses NumPy matrix operations instead of Python loops. It is fast and can fully utilize the CPU during computation.
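To illustrate what "matrix operations instead of loops" can mean for a convolution layer, here is a minimal sketch comparing a loop-based convolution with an im2col-style version that performs a single matrix multiplication. This is not the code from this repo: the function names `conv2d_loops` and `conv2d_matmul` are hypothetical, stride 1 and no padding are assumed, and `sliding_window_view` requires NumPy 1.20 or newer.

```python
import numpy as np


def conv2d_loops(x, w):
    """Reference version: Python loops over every output position."""
    n, c, h, wd = x.shape
    f, _, kh, kw = w.shape
    oh, ow = h - kh + 1, wd - kw + 1
    out = np.zeros((n, f, oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = x[:, :, i:i + kh, j:j + kw]  # (n, c, kh, kw)
            # Contract channel and kernel axes: result is (n, f).
            out[:, :, i, j] = np.tensordot(patch, w, axes=([1, 2, 3], [1, 2, 3]))
    return out


def conv2d_matmul(x, w):
    """im2col-style version: gather all patches, then one matrix multiply."""
    n, c, h, wd = x.shape
    f, _, kh, kw = w.shape
    # windows has shape (n, c, oh, ow, kh, kw).
    windows = np.lib.stride_tricks.sliding_window_view(x, (kh, kw), axis=(2, 3))
    oh, ow = windows.shape[2], windows.shape[3]
    # One row per output position, one column per (channel, ky, kx) entry.
    cols = windows.transpose(0, 2, 3, 1, 4, 5).reshape(n * oh * ow, c * kh * kw)
    out = cols @ w.reshape(f, -1).T  # (n * oh * ow, f)
    return out.reshape(n, oh, ow, f).transpose(0, 3, 1, 2)


rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 8, 8))   # batch of 2, 3 channels, 8x8 images
w = rng.standard_normal((4, 3, 3, 3))   # 4 filters of size 3x3
print(np.allclose(conv2d_loops(x, w), conv2d_matmul(x, w)))  # True
```

Both functions compute the same result; the second trades extra memory for the patch matrix against letting NumPy's optimized matrix multiply do the heavy lifting.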
The model is defined in model_config.json, and you can edit it to build various architectures, much like TensorFlow's Sequential. Detailed comments in nn_layers.py explain how each layer propagates, updates, and initializes.
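As a rough sketch of how a JSON-defined, Sequential-style model might be assembled, the example below builds a layer stack from a config string. The schema (`layers`, `type`, `params`), the `Dense` and `ReLU` classes, and the `build_model` helper are all assumptions made for illustration; the actual format of model_config.json and the classes in nn_layers.py may differ.

```python
import json

import numpy as np


class Dense:
    """Hypothetical fully connected layer with He-style initialization."""

    def __init__(self, in_features, out_features):
        rng = np.random.default_rng(0)  # fixed seed, only to keep the sketch reproducible
        self.W = rng.standard_normal((in_features, out_features)) * np.sqrt(2.0 / in_features)
        self.b = np.zeros(out_features)

    def forward(self, x):
        return x @ self.W + self.b


class ReLU:
    """Hypothetical activation layer."""

    def forward(self, x):
        return np.maximum(x, 0.0)


# Maps the "type" string in the config to a layer class (assumed schema).
LAYER_TYPES = {"Dense": Dense, "ReLU": ReLU}


def build_model(config_text):
    """Turn a JSON description into an ordered list of layer objects."""
    spec = json.loads(config_text)
    return [LAYER_TYPES[layer["type"]](**layer.get("params", {})) for layer in spec["layers"]]


def forward(model, x):
    """Run the input through each layer in order, like a Sequential model."""
    for layer in model:
        x = layer.forward(x)
    return x


EXAMPLE_CONFIG = """
{"layers": [
  {"type": "Dense", "params": {"in_features": 784, "out_features": 32}},
  {"type": "ReLU"},
  {"type": "Dense", "params": {"in_features": 32, "out_features": 10}}
]}
"""

model = build_model(EXAMPLE_CONFIG)
logits = forward(model, np.zeros((4, 784)))
print(logits.shape)  # (4, 10)
```

With this kind of design, changing the architecture means editing the JSON only; the training code does not need to change.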
This repo is inspired by my undergrad project. The training speed was improved by 60x after refactoring the code.
Conv, FullyConnected, MaxPool, AvgPool, ReLU, LeakyReLU, BatchNorm2D, BatchNorm1D, Softmax, CrossEntropy
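To give a sense of how two of these layers fit together, here is a minimal sketch of a softmax output combined with a cross-entropy loss, including the well-known combined gradient (probabilities minus one-hot labels, averaged over the batch). The function names are hypothetical and integer class labels are assumed; the implementations in nn_layers.py may be organized differently.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax; subtracting the row max keeps exp() from overflowing."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the true class (labels are integer ids)."""
    n = probs.shape[0]
    return -np.log(probs[np.arange(n), labels] + 1e-12).mean()


def softmax_cross_entropy_grad(probs, labels):
    """Gradient of the mean loss with respect to the logits: (probs - one_hot) / n."""
    n = probs.shape[0]
    grad = probs.copy()
    grad[np.arange(n), labels] -= 1.0
    return grad / n


rng = np.random.default_rng(0)
logits = rng.standard_normal((3, 5))    # batch of 3 samples, 5 classes
labels = np.array([0, 2, 4])
probs = softmax(logits)
loss = cross_entropy(probs, labels)
grad = softmax_cross_entropy_grad(probs, labels)
print(probs.sum(axis=1), loss, grad.shape)
```

Computing the gradient for the two layers jointly avoids forming the full softmax Jacobian, which is why many implementations pair them.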
- Create a virtual environment and install dependencies:
pip install tensorflow matplotlib scikit-image
- TensorFlow here is only used to download the MNIST dataset.
- Start training:
python3 main.py
After training, the model is evaluated using test data.
If plot_sample_prediction == True, a sample prediction plot is generated after testing completes.