This is a reimplementation of the TC-ResNet8 and TC-ResNet14 architectures proposed by Hyperconnect. The original research targets a lightweight convolutional neural network that performs keyword spotting on audio data in real time on mobile devices.
Authors' implementation in TensorFlow
For training, validation and testing, this implementation uses Google's Speech Commands dataset. Please download the dataset and extract it into a folder named dataset in the root folder of the repository.
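Before training, it can help to confirm that the archive was extracted to the expected place. The snippet below is a minimal sketch and not part of this repository: the helper name `list_keyword_folders` is hypothetical, and it assumes the usual Speech Commands layout of one sub-folder per spoken word containing `.wav` files, plus auxiliary folders such as `_background_noise_` whose names start with an underscore.

```python
from pathlib import Path


def list_keyword_folders(dataset_root="dataset"):
    """Return the sorted names of sub-folders that hold keyword .wav files.

    Hypothetical helper for illustration; assumes one folder per spoken word.
    """
    root = Path(dataset_root)
    if not root.is_dir():
        raise FileNotFoundError(f"expected the extracted dataset at {root.resolve()}")

    keywords = []
    for entry in root.iterdir():
        # Skip plain files and auxiliary folders such as _background_noise_.
        if not entry.is_dir() or entry.name.startswith("_"):
            continue
        # Keep only folders that actually contain audio clips.
        if any(entry.glob("*.wav")):
            keywords.append(entry.name)
    return sorted(keywords)


if __name__ == "__main__":
    try:
        words = list_keyword_folders()
    except FileNotFoundError as err:
        print(err)
    else:
        print(f"found {len(words)} keyword folders")
```

Running the snippet from the repository root should report a non-zero number of keyword folders if the dataset is in place; an error message instead suggests the archive was extracted somewhere else.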
Run main.py to train the model.
Run live.py to demonstrate live prediction with the model in real time.