
Handwritten digit recognition for the fifth largest spoken language in the world


Sirsho1997/BengaliDigits


Handwritten digit recognition for the fifth largest spoken language in the world

IPython Notebook has been used for this project.

The data set used is CMATERdb (https://code.google.com/archive/p/cmaterdb/).

CMATERdb is the pattern recognition database repository created at the 'Center for Microprocessor Applications for Training Education and Research' (CMATER) research laboratory, Jadavpur University, Kolkata 700032, INDIA. This database is free for all non-commercial uses.

It is a balanced data set of 6000 Bangla numeral images (32x32 pixels, RGB), with 600 images per class (one class per digit).
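The evaluation step later in this README uses a held-out test set, but the README does not say how that set was produced. As a minimal sketch, assuming a stratified 80/20 split (the function name `stratified_split`, the 20% test fraction, and the stand-in labels are illustrative assumptions, not the notebook's code), a balanced data set can be divided so that every digit keeps the same share in both parts:

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices so each class keeps the same train/test proportion.

    Hypothetical helper for illustration; not taken from the notebook.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = int(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx


# Stand-in labels: 10 digit classes x 600 images each, as described above.
labels = [digit for digit in range(10) for _ in range(600)]
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)

print(len(train_idx), len(test_idx))
print(Counter(labels[i] for i in test_idx))
```

Because the classes are balanced, a stratified split gives every digit the same number of test images, so the accuracy figure is not dominated by any one class.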

To view the whole code, open the notebook in Colab.

One data set example

Building the model

Building the neural network requires configuring the layers of the model, then compiling the model. The basic building block of a neural network is the layer. Layers extract representations from the data fed into them. Hopefully, these representations are meaningful for the problem at hand.

CNN Architecture

A very common architecture for a CNN is a stack of Conv2D and MaxPooling2D layers followed by a few densely connected layers. The stack of convolutional and max-pooling layers extracts features from the image. These features are then flattened and fed to densely connected layers, which determine the class of the image based on which features are present.
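To make the feature-extraction idea concrete, here is a minimal pure-Python sketch of one convolution, ReLU, and 2x2 max-pooling step on a tiny single-channel image. The 6x6 image and the vertical-edge kernel are made-up illustrations; in a real Conv2D layer the kernel values are learned during training and there are many kernels per layer.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (stride 1).

    Like Conv2D, this is a cross-correlation: the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Replace negative responses with zero."""
    return [[max(0, value) for value in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Keep the strongest response in each non-overlapping 2x2 window."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


# Illustrative 6x6 image: dark left half, bright right half.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Illustrative kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]

features = relu(conv2d_valid(image, kernel))
pooled = max_pool_2x2(features)

for row in features:
    print(row)
print(pooled)
```

The convolution responds only where the edge is, and pooling halves the resolution while keeping that response, which is the sense in which the stacked layers "extract features".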

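# Set up the layers
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential()
# Convolutional base: 3x3 convolutions with 2x2 max pooling in between
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
# Classifier head: flatten the feature maps, then two dense layers
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))  # one raw score (logit) per digit class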
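The output shape and parameter count of each layer in the stack above can be worked out by hand. The sketch below does that arithmetic in plain Python, assuming the Keras defaults the snippet relies on (unpadded "valid" convolutions with stride 1, and pooling windows that do not overlap). The helper names are illustrative.

```python
def conv2d(shape, filters, kernel=3):
    """Unpadded convolution, stride 1: each spatial side shrinks by kernel - 1."""
    height, width, channels = shape
    params = kernel * kernel * channels * filters + filters  # weights + biases
    return (height - kernel + 1, width - kernel + 1, filters), params


def max_pool(shape, pool=2):
    """Non-overlapping pooling: each spatial side is floor-divided by pool."""
    height, width, channels = shape
    return (height // pool, width // pool, channels), 0


def dense(units_in, units_out):
    """Fully connected layer: one weight per input/output pair plus biases."""
    return units_out, units_in * units_out + units_out


shapes, total_params = [], 0

shape = (32, 32, 3)
for step in (lambda s: conv2d(s, 32), max_pool,
             lambda s: conv2d(s, 64), max_pool,
             lambda s: conv2d(s, 64)):
    shape, params = step(shape)
    shapes.append(shape)
    total_params += params

flat = shape[0] * shape[1] * shape[2]  # Flatten has no parameters
units = flat
for width in (64, 10):
    units, params = dense(units, width)
    total_params += params

print(shapes)
print(flat, total_params)
```

Most of the parameters sit in the first dense layer, since it connects every flattened feature to each of its 64 units.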

Compiling the model

Before the model is ready for training, it needs a few more settings.

Loss function: measures how far the model's predictions are from the true labels during training. You want to minimize this function to "steer" the model in the right direction.

Optimizer: decides how the model's weights are updated based on the data it sees and its loss function.

Metrics: used to monitor the training and testing steps. The following example uses accuracy, the fraction of images that are correctly classified.

# Compile the model
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'],
)
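The loss chosen above takes integer class labels and raw scores (logits), which is why the last layer has no softmax. As a rough sketch of the quantity it computes for a single example, the function below evaluates the negative log of the softmax probability assigned to the true class. The function name and the example logits are illustrative, not part of the notebook.

```python
import math


def sparse_ce_from_logits(logits, label):
    """Cross-entropy for one example: -log(softmax(logits)[label]).

    Uses the log-sum-exp trick (subtracting the largest logit) for
    numerical stability.
    """
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]


# Ten equal logits: the model has no preference among the ten digits.
uniform = [0.0] * 10

# Made-up logits that strongly favour class 3.
confident = [0.0] * 10
confident[3] = 8.0

loss_uniform = sparse_ce_from_logits(uniform, 3)
loss_right = sparse_ce_from_logits(confident, 3)   # true class is 3
loss_wrong = sparse_ce_from_logits(confident, 7)   # true class is 7

print(loss_uniform, loss_right, loss_wrong)
```

An undecided model scores ln(10) per example, a confident correct prediction scores close to zero, and a confident wrong prediction is penalised heavily.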

Evaluate Accuracy

Next, compare how the model performs on the test data set.

# Evaluate on the held-out test images and labels
test_loss, test_accuracy = model.evaluate(test_x, test_y, verbose=2)
print(f"Accuracy: {test_accuracy * 100:.2f}%")
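Since the model outputs one logit per class, the predicted digit is the index of the largest logit, and accuracy is the share of examples where that index matches the label. The sketch below shows this calculation on made-up logits, using three classes instead of ten to keep it short. The helper names are illustrative.

```python
def predict_class(logits):
    """Return the index of the largest logit."""
    return max(range(len(logits)), key=lambda k: logits[k])


def accuracy(batch_logits, labels):
    """Fraction of examples whose predicted class equals the true label."""
    correct = sum(
        predict_class(logits) == label
        for logits, label in zip(batch_logits, labels)
    )
    return correct / len(labels)


# Made-up logits for four examples and three classes.
batch_logits = [
    [2.0, 0.1, -1.0],
    [0.3, 1.5, 0.2],
    [0.0, 0.4, 0.9],
    [1.2, 0.8, 0.1],
]
true_labels = [0, 1, 2, 1]

predictions = [predict_class(logits) for logits in batch_logits]
print(predictions, accuracy(batch_logits, true_labels))
```

Applying softmax first would not change the prediction, because softmax preserves the ordering of the logits.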

Plotting several images along with their predictions

Contributor -
