This document presents a comparison of different convolutional neural network (CNN) models for handwritten number recognition that vary by layers. The models are trained on the MNIST dataset. A basic CNN model with convolutional, pooling, and fully connected layers is described. Models with different numbers and placements of layers are tested, and their training accuracy, validation accuracy, and test loss are compared. The optimal model is found to have two dropout layers and achieves 99.64% validation accuracy and the lowest test loss. User input can be tested on the model, and future work may involve improving accuracy for different writing styles.
1. HANDWRITTEN NUMBER RECOGNITION USING CNN AND COMPARISON OF PERFORMANCE
OF MODELS VARYING BY LAYERS
Presented by:
Name: Subhradeep Maji,
M.Sc, Part II, 2nd Semester
Department of Computer Science and Engineering
University of Kalyani
2. CONTENTS
• Introduction
• Purpose
• Digit recognition methods
• Different deep learning models
• Convolution neural networks (CNN)
• Layers of basic CNN model
• MNIST dataset
• Basic steps to be followed
• Creating the model
• Train the model
• Test on user input
• Comparison of different models (varied by
layers)
• Discussion
• Result
• Conclusion
• Future work
3. INTRODUCTION
• With the rapid growth of technology, applications of deep learning are increasing. Handwritten
digit recognition is one of the important research areas of deep learning today.
• This project describes an approach for efficiently recognizing digits or numbers written by
different people with the help of a convolutional neural network (CNN), taking into account different
writing styles and ink colors. The model is trained on the “Modified National Institute of
Standards and Technology (MNIST)” dataset.
• We then compare the performance of different models that vary in their layers.
4. PURPOSE
• The main purpose of a handwritten digit recognition system is to convert handwritten digits
into machine-readable formats.
• The main objective of this work is to recognize handwritten digits effectively, making
several official operations easier, less error-prone, and more time-efficient.
5. DIGIT RECOGNITION METHODS
Deep learning is the most convenient method for recognizing digits; Figure 1 compares its performance with other algorithms.
Figure 1: Performance Comparison between Deep Learning vs Other Algorithms [1]
6. DIFFERENT DEEP LEARNING MODELS
• SUPERVISED MODEL
• Classic Neural Networks (Multilayer Perceptron)
• Convolutional Neural Networks (CNNs)
• Recurrent Neural Networks (RNNs)
• UNSUPERVISED MODEL
• Self-organizing Maps (SOM)
• In this project I have used a convolutional neural network to solve the classification problem.
7. CONVOLUTIONAL NEURAL NETWORK (CNN)
• CNNs were designed specifically for image data and might be the most efficient and flexible models
for image classification problems.
• A CNN has multiple layers that process the image, extract features, and classify it into the correct
class.
• Convolution layer: It consists of several filters that perform feature extraction.
• Rectified Linear Unit (ReLU): It introduces non-linearity into the ConvNet. Its output is a rectified feature map.
• Pooling layer: It is a down-sampling operation that reduces the dimensions of the feature map. Here I
have used the max pooling layer, which selects the maximum value from the region covered by the filter matrix.
• Fully connected layer: The flattened matrix from the pooling layer is fed as input to a fully connected
layer, which classifies and identifies the images.
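The three feature-extraction operations described above can be sketched in a few lines of NumPy. This is only an illustration, not the project's code: the 6x6 input, the 3x3 vertical-edge kernel, and the helper names conv2d_valid, relu, and max_pool are all assumptions chosen to keep the numbers easy to follow.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Replace negative responses with zero (the rectified feature map)."""
    return np.maximum(x, 0)

def max_pool(x, size=2):
    """Keep the maximum of each non-overlapping size x size region."""
    h, w = x.shape[0] // size, x.shape[1] // size
    x = x[:h * size, :w * size]
    return x.reshape(h, size, w, size).max(axis=(1, 3))

# Hypothetical 6x6 image: a vertical bar of ones in columns 2 and 3.
image = np.zeros((6, 6))
image[:, 2:4] = 1.0

# Hypothetical 3x3 filter that responds to dark-to-bright vertical edges.
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

feature_map = conv2d_valid(image, kernel)   # shape (4, 4), rows are [3, 3, -3, -3]
rectified = relu(feature_map)               # rows become [3, 3, 0, 0]
pooled = max_pool(rectified)                # shape (2, 2)
print(pooled)
```

The filter fires on the left edge of the bar, ReLU discards the negative response on the right edge, and pooling halves each spatial dimension while keeping the strongest response.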
8. LAYERS OF A BASIC CNN MODEL
Convolution + ReLU + MaxPooling
Fully Connected Layer
Figure 2: Layers in a basic CNN model
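To see how the layers in Figure 2 connect, it helps to trace the feature-map size from the input image to the fully connected layer. The sketch below assumes a 28x28 input (the MNIST image size), a 3x3 convolution without padding, 2x2 max pooling, and 32 filters; none of these hyperparameters are stated in the slides.

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    """Output side length of a square convolution."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, pool=2):
    """Output side length of non-overlapping pooling."""
    return size // pool

side = 28                  # input image is 28x28
side = conv_out(side)      # assumed 3x3 convolution -> 26
side = pool_out(side)      # assumed 2x2 max pooling -> 13
filters = 32               # hypothetical number of convolution filters

# Flattening turns the stack of feature maps into one input vector
# for the fully connected layer.
flattened = side * side * filters
print(side, flattened)     # 13 5408
```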
9. MNIST DATASET
• Modified National Institute of Standards and Technology
(MNIST) dataset [2].
• It is a dataset of 60,000 training samples and 10,000
test samples; every sample is a square
28x28-pixel grayscale image.
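Before images of this shape are fed to a CNN, they are commonly scaled and reshaped. The slides do not describe the preprocessing used, so the following is a sketch of one usual approach, run on random stand-in data with MNIST's shape rather than on the real dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a small batch of MNIST images: uint8 pixels, shape (N, 28, 28).
images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)
labels = np.array([3, 0, 9, 1, 7])          # hypothetical digit labels

# Scale pixel values to [0, 1] and add a channel axis for the convolution layers.
x = images.astype("float32") / 255.0
x = x[..., np.newaxis]                      # shape (N, 28, 28, 1)

# One-hot encode the ten digit classes.
y = np.eye(10, dtype="float32")[labels]     # shape (N, 10)

print(x.shape, y.shape)
```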
Figure 3: MNIST dataset [3]
11. CREATING THE MODEL
• Design a sequential model that consists of the following layers:
12. TRAIN THE MODEL
• Once we have the model, the following steps are followed for training:
13. COMPARISON OF DIFFERENT MODELS (VARIED BY LAYERS)
Case | Layers                                                  | Dropout layers used                     | Batch size | Epochs | Max train accuracy | Max validation accuracy | Total test loss
1    | Conv1 + Pooling1 + Conv2 + Pooling2 + Hidden1 + Hidden2 | One (after Hidden1 layer)               | 64         | 15     | 98.18%             | 99.04%                  | 0.0267
2    | Conv1 + Conv2 + Pooling + Hidden1 + Hidden2             | None                                    | 64         | 15     | 99.88%             | 98.57%                  | 0.0428
3    | Conv1 + Pooling1 + Conv2 + Pooling2 + Hidden1 + Hidden2 | Two (after Pooling2 and Hidden1 layers) | 64         | 15     | 99.62%             | 99.64%                  | 0.0239
4    | Conv1 + Conv2 + Pooling + Hidden1 + Hidden2             | One (after Hidden1 layer)               | 64         | 15     | 98.19%             | 99.17%                  | 0.0261
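The comparison can also be checked programmatically. The sketch below copies the figures from the table, numbering the rows 1 to 4 in the order they appear, and computes the gap between training and validation accuracy; the tuple layout and variable names are illustrative only.

```python
cases = [
    # (case, layers, dropout layers, max train acc %, max val acc %, test loss)
    (1, "Conv1+Pooling1+Conv2+Pooling2+Hidden1+Hidden2", "one", 98.18, 99.04, 0.0267),
    (2, "Conv1+Conv2+Pooling+Hidden1+Hidden2", "none", 99.88, 98.57, 0.0428),
    (3, "Conv1+Pooling1+Conv2+Pooling2+Hidden1+Hidden2", "two", 99.62, 99.64, 0.0239),
    (4, "Conv1+Conv2+Pooling+Hidden1+Hidden2", "one", 98.19, 99.17, 0.0261),
]

for case, _, dropout, train, val, loss in cases:
    # A large positive gap means training accuracy outruns validation accuracy.
    gap = round(train - val, 2)
    print(f"case {case}: dropout={dropout:<4} gap={gap:+.2f} test loss={loss}")

best_by_loss = min(cases, key=lambda c: c[5])
best_by_val = max(cases, key=lambda c: c[4])
print("lowest test loss: case", best_by_loss[0])
print("highest validation accuracy: case", best_by_val[0])
```

Only the model without dropout shows training accuracy clearly above validation accuracy, and the same model has the highest test loss.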
14. DISCUSSION
• Although case 2 achieves the highest training accuracy of 99.88%, we do not consider that
model optimal: it has the lowest validation accuracy (98.57%) and the maximum test loss of 0.0428, which suggests overfitting.
• Case 3 achieves a validation accuracy of 99.64%, the highest among all the test
cases, and also produces the minimum test loss of 0.0239. Hence we consider this our optimal
classification model.
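The case 3 model differs from the others in using two dropout layers. As a minimal sketch of what a dropout layer does during training, the NumPy function below implements the common "inverted" formulation; the rate of 0.5 is an assumption, since the slides do not state the rates used.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of units and rescale the rest."""
    if not training or rate == 0.0:
        return activations            # at test time the layer does nothing
    keep = 1.0 - rate
    mask = rng.random(activations.shape) < keep
    return activations * mask / keep  # rescale so the expected value is unchanged

rng = np.random.default_rng(42)
hidden = np.ones((4, 1000))           # hypothetical hidden-layer activations

dropped = dropout(hidden, rate=0.5, rng=rng)
at_test_time = dropout(hidden, rate=0.5, rng=rng, training=False)

print(sorted(set(dropped.ravel().tolist())))   # [0.0, 2.0]
print(round(float(dropped.mean()), 2))         # close to 1.0 on average
```

Because a different random subset of units is silenced at every training step, the network cannot rely on any single unit, which is why dropout tends to reduce overfitting.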
17. RESULT (CONTD.)
• We also get some wrongly predicted outputs while testing, which may be caused by residual training loss or
by some overfitting.
18. CONCLUSION
• In this project, the variation in accuracy for handwritten digit recognition was observed over 15 epochs by
varying the layers of a CNN model trained on the MNIST digit dataset.
• The maximum validation accuracy was 99.64%, and the lowest total test loss was
approximately 0.0239.
• Higher accuracy of this kind should help improve the performance of machine digit
recognition.
• Such a low loss should help the CNN perform better when handling image resolution and
noise.
19. FUTURE SCOPE
• Build a CNN model that classifies different handwriting styles more accurately by varying the number of hidden layers and
the batch size.
• An approach called “Ensemble Model” could give more accurate predictions when recognizing
numbers.
• Add new features that can predict numbers from live or real-time video.
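One simple form of the ensemble idea mentioned above is to average the class probabilities of several trained models and pick the most likely class. The sketch below uses made-up softmax outputs for three hypothetical models, with only three classes shown for brevity; it is an illustration of the averaging step, not a tested extension of this project.

```python
import numpy as np

def ensemble_predict(prob_list):
    """Average per-model class probabilities, then pick the most likely class."""
    mean_probs = np.mean(np.stack(prob_list), axis=0)
    return mean_probs.argmax(axis=1)

# Hypothetical softmax outputs for two images from three different models.
model_a = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
model_b = np.array([[0.4, 0.5, 0.1], [0.1, 0.3, 0.6]])
model_c = np.array([[0.5, 0.4, 0.1], [0.2, 0.2, 0.6]])

predictions = ensemble_predict([model_a, model_b, model_c])
print(predictions)   # [0 2]
```

Here model_b alone would misjudge the first image and model_a the second if the ensemble answer is taken as correct, but averaging lets the other two models outvote each one.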