Project report - Bengali digit recognition using SVM

CMPUT 551 Project Report
Bengali Handwritten Digit Recognition with Support
Vector Machines
Submitted By
Mohammad Saiful Islam
Student Id: 1270123
Date of Submission
21st December, 2010

Introduction:
Character recognition originated as a subfield of pattern recognition, but the need to recognize
characters in many application areas has in turn driven progress in pattern recognition and image
analysis [1]. Character recognition can be classified into two types, online and offline. In online
character recognition, the system has to follow the dynamic motion of the pen to recognize the
character while it is being written; in offline character recognition, static text is used as
input. From another perspective, character recognition can be divided into machine-printed and
handwriting recognition.
Bengali is an eastern Indic language. It is native to the region of eastern South Asia known as
Bengal, which comprises present-day Bangladesh and the Indian state of West Bengal, and parts of
the Indian states of Tripura and Assam. With 300 million native speakers, it is ranked 6th based
on number of native speakers [2].
In the current project I have decided to work on handwriting recognition for the Bengali language.
A lot of work has already been done on machine-printed character recognition, but there has been
little work on handwriting recognition. For this project I have worked only on the Bengali digits.
This is a tradeoff I had to make because of limited time. Such a project requires a good dataset
for training and testing, and my first and most difficult problem was to find a good dataset of
Bengali alphabets and numerals. After an exhaustive search I found only a small numeral dataset,
and because of time constraints I was able to build only a small set of my own, so I am working
only on digit recognition.
The work of digit recognition can be divided into several blocks, each of which has several
internal steps. The block diagram of the whole process is given in Figure 1.
Figure 1: Process of Digit Recognition.
In the current project I have worked mainly on the classification step, where the system is given
a set of training feature vectors of digits to train itself; then, when a test feature vector of a
digit is given, it classifies the digit into the respective class.
[Figure 1 block diagram: Document Input -> Pre-processing -> Feature Extraction -> Classification -> Post Processing -> Output]

Literature Review:
In the early stages of OCR, template-matching techniques were used. These templates were designed
using a small number of samples, and as the number of samples grew the technique failed to give
good results. Researchers then turned to methods that learn from examples, such as artificial
neural networks. Support vector machines are now applied to recognition tasks with great accuracy.
Non-parametric statistical methods such as 1-nearest neighbor (1-NN), k-NN and decision trees have
also been used, but the nearest-neighbor methods are expensive since all the training samples have
to be stored and compared.
Research on OCR systems for recognizing Bengali characters started in the mid-1980s and a variety
of approaches have been applied. Some of those works were complete systems and some covered only
one part of a complete system, such as preprocessing, feature extraction, classification or
post-processing. Researchers have used many types of classifiers for OCR, such as Nearest
Neighbor [3], feature-based tree classifier [4], template matching [5], distance-based
classifier [6], Neural Networks [7], Hidden Markov Model [8] and Support Vector Machines [11].
Hasnat et al. [8] developed a Hidden Markov Model based OCR with multi-font support, with a
separate HMM for each segmented character or word. The system uses the HTK toolkit for data
preparation, model training and recognition. They transformed the raw pixel values using the
Discrete Cosine Transform. Rahan et al. [12] proposed a multistage method for Bengali handwriting
recognition. They argued that alphabets can be grouped by their distinct characteristics and that
building a multistage classifier on these groups makes the classifier more robust. The multistage
classifier can outperform a single-stage classifier because of its ability to handle extreme
variance between the training and test examples. Arora et al. [9] compared the two most popular
methods for handwriting recognition, ANN and SVM, on Devnagari characters, which are similar to
Bengali, and found that SVM works as well as ANN, which is widely used in handwriting
recognition. Bhowmik et al. [10] made a comparative study of multilayer perceptron, radial basis
function network and SVM for Bengali character recognition and found that SVM outperforms the
other two methods. They proposed that a hierarchical learning architecture (HLA) based on SVM
will perform better than a single-stage SVM. For the first stage of the classifier they formed
groups on the basis of the confusion matrix obtained by SVM. Chanda et al. [11] used SVM to
automatically identify an individual from Bengali handwriting. They experimented with discrete
directional features and gradient features and obtained satisfactory results for gradient
features. Umapada et al. [13] used SVM for recognizing multi-oriented Bengali printed characters.
For recognition of multi-sized/multi-oriented characters the features were computed from angular
information obtained from the external and internal contour pixels of the characters. This
angular information was computed in such a way that it does not depend on the size and rotation
of the characters. Circular and convex hull rings were used to divide a character into smaller
zones to get zone-wise features for higher recognition results. Liu et al. [14] compared six
classifiers (MLP, MQDF, DLQDF, PNC, CFPC and SVM) for Bengali handwritten digit classification
and found that SVM produces the highest classification accuracy. They concluded that good results
can be obtained by classifying gray-scale images rather than binary images, using gray-scale
normalization and moment or bi-moment normalization. Maji et al. [15] found that although
polynomial kernels with SVMs are mainly used for digit recognition with raw pixels, they are
impractical due to high complexity at runtime. So they proposed using improved features with a
low-complexity classifier. Their experiments with standard digit databases showed high accuracy
compared to complex classifiers using RBF kernels. Edson et al. [19] showed that SVM performs
better than HMM for offline handwriting recognition.
Methods:
For this project I have chosen to use a Multiclass Support Vector Machine (MSVM). The original
binary Support Vector Machine (SVM) was invented by Vladimir Vapnik, and the soft margin case
was proposed by Corinna Cortes and Vladimir Vapnik [16]. The MSVM extends the binary SVM to
classify data into multiple classes. In this project I have used both the linear and nonlinear
versions of MSVM; the nonlinear version uses kernels, as proposed by Bernhard Boser, Isabelle
Guyon and Vapnik [17].
In general, an SVM is a non-probabilistic binary linear classifier which constructs a hyperplane
or a set of hyperplanes in a high-dimensional space, which can be used for classification. A
special property is that SVMs simultaneously minimize the empirical classification error and
maximize the geometric margin; hence they are also known as maximum margin classifiers [18]. In
this project the basic SVM is implemented using soft margins. Cortes and Vapnik suggested the
soft margin to allow mislabelled examples [16]. If there exists no hyperplane that can fully
separate the two classes, then the soft margin method chooses a hyperplane that splits the
examples as cleanly as possible while still maximizing the distance to the nearest cleanly split
data points. The kernel trick is used to transform the feature space. The transformation may be
nonlinear, so the classifier may be a hyperplane in the higher-dimensional space but nonlinear in
the original input space. The data may not be separable in the original space, but this
transformation may make them linearly separable in the higher-dimensional space.
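For reference, the soft-margin problem of [16] is usually written in the form below. This is the standard textbook formulation, not something taken from the implementation in this report: the symbols w, b, the slack variables ξ_i, the trade-off constant C, the sample count n and the feature map φ are assumptions introduced here, and the report does not state how its regularization parameter beta relates to C.

```latex
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i
\quad\text{subject to}\quad
y_i\bigl(w\cdot\phi(x_i) + b\bigr) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\quad i = 1,\dots,n
```

Each slack variable measures how far one training example falls short of the margin, so the sum is the price paid for examples that cannot be separated cleanly.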
A multiclass problem can be solved by reducing it to binary problems. The original problem is
transformed into several binary classification problems. Each of them yields a binary
classifier, which is assumed to produce an output function that gives relatively large values for
examples from the positive class and relatively small values for examples belonging to the
negative class. There are two common methods to solve multiclass problems with binary
classifiers: the one-versus-all method and the one-versus-one method.
Suppose we have C classes. The one-versus-all method creates C distinct classifiers. The ith
classifier is trained using data points from class i as positive and all others as negative. A new
data point is assigned to the class whose classifier gives the highest value. The one-versus-one
method needs C(C-1)/2 binary classifiers. Classifier Cij treats class i as positive and class j as
negative. For a new example, every classifier is applied and casts a vote for one of its two
classes; the example is assigned to the class with the largest number of votes.
For this project I have implemented the one-versus-all method. The reason for choosing it over the
other is the number of classifiers. There are 10 classes in this project, so I am building 10
classifiers. If I had implemented the one-versus-one method I would have needed 45 classifiers.
Although the one-versus-one method may give more accurate results, I judged the time needed to
train 45 classifiers to be too much compared to the time required for this method, and I expected
to get good accuracy with one-versus-all.
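The classifier counts quoted above follow directly from the two formulas. A minimal Python sketch is given here for checking the arithmetic; the function name is hypothetical and the report's own implementation was not written in Python.

```python
def num_binary_classifiers(num_classes):
    """Return (one-versus-all count, one-versus-one count) for a C-class problem."""
    one_vs_all = num_classes                            # one classifier per class
    one_vs_one = num_classes * (num_classes - 1) // 2   # one classifier per pair of classes
    return one_vs_all, one_vs_one


print(num_binary_classifiers(10))  # (10, 45)
```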
The three types of kernels used in the project are
1. Linear kernel: $K_{lin}(x, y) = x \cdot y$
2. Polynomial kernel: $K_{poly}(x, y) = (x \cdot y + 1)^{d}$
3. RBF kernel: $K_{rbf}(x, y) = \exp\left(-\lVert x - y \rVert^{2} / (2\sigma^{2})\right)$
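The three kernels can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the code used in the project: feature vectors are taken to be lists of numbers, and the function names are hypothetical.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, d):
    return (dot(x, y) + 1) ** d


def rbf_kernel(x, y, sigma):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * sigma ** 2))


x = [1, 0, 1]
y = [1, 1, 0]
print(linear_kernel(x, y))         # 1
print(polynomial_kernel(x, y, 2))  # 4
print(polynomial_kernel(x, y, 0))  # 1, whatever x and y are
```

Note that with d = 0 the polynomial kernel equals 1 for every pair of inputs, so it carries no information about the data. That is consistent with the d = 0 rows of the result tables staying at 10%, the chance level for ten classes.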
In the implementation, the learner function takes two parameters, X and y, as the training data
and outputs a model which is used by the classifier to classify new data. For the current problem
y is a vector of numbers ranging from 0 to 9. But the binary classifier $c_i$ needs a label
vector $Y_i$ such that
$$Y_{ij} = \begin{cases} +1 & \text{if } y_j = i \\ -1 & \text{otherwise} \end{cases}$$
First the label vector y is transformed into 10 separate label vectors, one for each classifier.
Then the training data is provided to each of the 10 classifiers to find 10 weight vectors
($l_i$) and offsets ($b_i$). These values together with the original training data comprise the
model.
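The label transformation described above can be sketched as follows. The function name and the small example label vector are hypothetical, chosen only to show the +1/-1 encoding.

```python
def one_vs_all_labels(y, num_classes=10):
    """Build one +1/-1 label vector per class from the integer labels in y."""
    return [[1 if label == i else -1 for label in y] for i in range(num_classes)]


y = [0, 3, 3, 9]          # illustrative digit labels
Y = one_vs_all_labels(y)
print(Y[3])               # [-1, 1, 1, -1]
```

Each of the 10 binary classifiers is then trained on the same feature matrix X with its own row of Y.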
In the classify function, the new data is classified using the 10 separate classifiers. At first I
used the sign function as the result of these classifiers. But this creates ambiguous states, for
example when several classifiers or none of them return +1, which makes the result inaccurate. The
method proposed by Vapnik [4] to improve this situation is to use the continuous values of the SVM
decision functions rather than their signs. The class of a data point is whichever class has the
decision function with the highest value, regardless of the sign.
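A minimal sketch of this decision rule is shown below. It assumes the 10 decision values have already been computed by the binary classifiers; the function name and the example scores are hypothetical.

```python
def classify(decision_values):
    """Return the index of the class whose classifier gives the largest decision value."""
    return max(range(len(decision_values)), key=lambda i: decision_values[i])


# Every classifier outputs a negative value, so the sign alone would assign no class.
scores = [-1.2, -0.3, -2.0, -0.9, -1.5, -1.1, -0.8, -1.7, -2.2, -1.0]
print(classify(scores))  # 1
```

The same rule also resolves the opposite ambiguity, where more than one classifier returns a positive value.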

Figure 2: MSVM with continuous decision function
One of the difficult phases of this project was choosing the classifier. From the literature
review I learned that for handwriting recognition, especially Bengali handwriting recognition, the
most popular classifiers are Artificial Neural Networks (ANN), Hidden Markov Models (HMM) and
Support Vector Machines (SVM). There are also some mixed multi-layer approaches.
A comparison between ANN and SVM on different properties is given below [9].
Complexity of training: The parameters of neural classifiers are generally adjusted by gradient
descent. When the training samples are fed for a fixed number of sweeps, the training time is
linear in the number of samples. SVMs are trained by quadratic programming (QP), and the training
time is generally proportional to the square of the number of samples. Some fast SVM training
algorithms with nearly linear complexity are available.
Model selection: The generalization performance of neural classifiers is sensitive to the size of
the structure, and the selection of an appropriate structure relies on cross-validation. The
convergence of neural network training suffers from local minima of the error surface. On the
other hand, the QP learning of SVMs guarantees finding the global optimum. The performance of
SVMs depends on the selection of kernel type and kernel parameters, but this dependence is less
influential.
Classification accuracy: SVMs have demonstrated superior classification accuracy to neural
classifiers in many experiments.
Storage and execution complexity: SVM learning by QP often results in a large number of support
vectors (SVs), which must be stored and evaluated during classification. Neural classifiers have
far fewer parameters, and the number of parameters is easy to control. In short, neural
classifiers consume less storage and computation than SVMs.
Unlike ANN, the computational complexity of SVM does not depend on the dimensionality of the
input space. ANNs use empirical risk minimization, while SVMs use structural risk minimization.
SVMs often outperform ANNs because they are less prone to overfitting. For these reasons I
preferred SVM over ANN.
The HMM has attracted the attention of many researchers in pattern recognition, and in
handwriting, speech and signature verification. This statistical learning approach has the ability
to absorb both the variability and the similarity between patterns. It is based on the empirical
risk minimization (ERM) principle, the simplest of the induction principles, in which a decision
rule is chosen based on a finite number of known examples (the training set).
There are some problems related to HMM. The first is finding the probability of the observation
sequence given the model; computing it is very expensive even with dynamic programming (the
forward-backward procedure). The second is adjusting the parameters to maximize the probability of
the observations: there is no way to analytically find the global maximum, so training can get
stuck in a local maximum. In addition, determining the number of states in the model and the
number of models is an important task because the performance of the classifier depends on them.
Several of the papers I reviewed stated that SVM can show good performance on handwritten
character recognition, especially for Bengali character and digit recognition. Some of them
compared the performance of SVM, HMM and ANN and showed that SVM can sometimes even outperform
the other two methods in handwriting recognition. Last but not least, SVM is a relatively new
approach to classification in machine learning compared to the other methods, and it has created
great interest in both academia and industry. I wanted to explore this new field in this project
to gain some deeper knowledge of the method.
An important part of any handwritten character recognition system is preprocessing. In this part,
continuous writing is first segmented to find the individual characters; next, the individual
characters are read in monochrome or grayscale mode to obtain the raw features to be used in the
training/testing step. Often several intermediate steps, such as filters, are applied to improve
the raw features and ensure greater classification accuracy. Segmentation and filtering is itself
a huge research area, so I am skipping this part in my project. I assume that I am given a set of
segmented images of digits. Most of the previous literature uses filters to improve the features,
but I wanted to test the accuracy on raw pixels.
Recognition of Bengali characters is very difficult for several reasons. There are 13 vowels
which can take modified forms when connected with consonants. Some of the characters have half
forms when connected together. These compound characters make character segmentation very
difficult. All the individual characters are joined by a head line called “Matra”. This makes it
difficult to isolate individual characters from the words. There are various isolated dots, which
are vowel modifiers, namely “Anuswar”, “Visarga” and “Chandra Bindu”, which add to the confusion.
Recognition of ascenders and descenders is also complex. Again, there is no database to use, so I
would have had to build one of my own, which would take a lot of time. Given these difficulties I
preferred to work on the digits only.
Hypotheses:
For this project I have a set of hypotheses, which I intend to test using experiments. They are
1. SVM can show good performance in Bengali handwritten digit recognition.
2. Use of RBF kernels will boost the performance compared to Linear and Polynomial Kernels.
3. Using raw pixels we can achieve good accuracy on the recognition.
4. Training the classifier using samples from one person and testing with samples from
different persons will reduce the accuracy of recognition.
Experimental design:
To test the stated hypotheses, I have planned to run a set of experiments. For the experiments I
first have to select the dataset. On the internet I found only a small dataset of grayscale
images, but I wanted to test with monochrome images too, so I have built a dataset of my own. The
dataset was created by using a tablet to write single digits, one at a time, in paint software and
saving each image in a monochrome bitmap format. Each image has a dimension of 20 by 20 pixels,
and as the images are saved in a monochrome format they consist only of 0 and 1. 1 represents the
white background and 0 represents the actual digit. Two persons wrote all the digits and there
are 700 digits written, 70 for each of the digits. 500 digits are used for training and 200 are
used for testing. A sample set of digits is given below.

All the training samples were written by one person, but the 200 test samples were written by two
persons, 100 each, to test the last hypothesis.
The images are read using Octave to get a 20 by 20 matrix of 0s and 1s. Each matrix is then
reshaped into a 1 by 400 vector which represents an image. 700 such vectors are stacked to form a
700 by 400 feature matrix, and they are labeled appropriately from 0 to 9.
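The reshaping step can be illustrated with a short Python sketch. This is not the Octave code used in the project: the image here is a hypothetical blank 20 by 20 grid with a single ink pixel, and the flattening is row by row, whereas Octave's reshape works column by column. Either order works as long as it is the same for every image.

```python
def flatten_image(image):
    """Flatten a 2-D list of pixel rows into a single feature vector (row by row)."""
    return [pixel for row in image for pixel in row]


# Hypothetical image: 1 = white background, 0 = ink, as in the monochrome dataset.
image = [[1] * 20 for _ in range(20)]
image[5][7] = 0

vector = flatten_image(image)
print(len(vector))  # 400
```

Stacking one such vector per image gives the feature matrix, with one row per digit sample.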
The dataset found on the internet was the ISI Bengali numeral dataset [20]. The original dataset
has 19,392 training samples and 4000 test samples, where the images are gray-scale with noisy
background and the gray level of the foreground varies considerably. I was only able to get a
partial dataset because obtaining the full one would have required some time. The partial set has
500 samples, 50 for each digit. I have used the first 40 samples of each digit as the training set
and the last 10 as the test set. An example of the dataset is given below.
The TIFF images are read using Octave. The images had various sizes so I rescaled them to 20 by
20 pixels. They are all gray-scale, so pixel values range from 0 to 255, where 0 denotes the
darkest color and 255 denotes the white background. The width of the stroke is greater than one
pixel.
For each of the three kernels a set of experiments is done using varying regularization parameter
beta and kernel parameter d or sigma. For each experiment the classifier is trained using the
training samples and then tested using the test samples. The recognition accuracy is then recorded
for result analysis.
As I wanted to test the effect of the regularization parameter beta and the kernel parameters d
and sigma, no cross validation is used. Again, no feature selection method was applied. For each
of the kernels, 10 values of beta are used, from $2^{-5}$ to 16 ($2^{-5}, 2^{-4}, \ldots, 2^{4}$).
For the RBF kernel, 15 values of sigma are used, from $2^{-10}$ to 16
($2^{-10}, 2^{-9}, \ldots, 2^{4}$). For the polynomial kernel, 10 values of d are used, from 0 to 9.
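The parameter grid can be written out explicitly, as in the sketch below. The sigma range is an assumption read off the result tables, whose smallest sigma row is 0.000977, which is approximately 2^-10; the variable names are hypothetical.

```python
betas = [2.0 ** k for k in range(-5, 5)]     # 2^-5 ... 2^4: 10 values
sigmas = [2.0 ** k for k in range(-10, 5)]   # 2^-10 ... 2^4: 15 values (assumed from the tables)
degrees = list(range(10))                    # d = 0 ... 9

# Number of train/test runs per dataset: linear + polynomial + RBF
runs = len(betas) + len(betas) * len(degrees) + len(betas) * len(sigmas)
print(len(betas), len(sigmas), len(degrees), runs)  # 10 15 10 260
```

Every combination is trained once and scored once on the held-out test samples, which matches the one-cell-per-setting layout of the tables.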

Experiments:
The first set of experiments is done using 100 test samples from one person. The percentage
accuracy for different beta using the linear kernel is given in Table 1. From the table it can be
seen that the linear kernel shows very good accuracy and that the performance is almost
independent of the regularization parameter beta.
Beta        0.03125  0.0625  0.125  0.25  0.5  1   2   4   8   16
% Accuracy  99       99      99     99    99   99  99  99  99  97
Table 1: Percentage accuracy for different beta using linear kernel – Built in data, one person
The percentage accuracy for different beta and d using the polynomial kernel is given in Table 2.
From the table it can be observed that the polynomial kernel does not show good performance for
all d. For smaller degrees (1-3) the classifier is able to show good performance, but for larger
degrees the performance drops dramatically. Beta also affects the performance, most visibly for
d = 3.
Rows: degree d; columns: beta; entries: % accuracy
d \ beta 0.03125 0.0625 0.125 0.25 0.5 1 2 4 8 16
0 10 10 10 10 10 10 10 10 10 10
1 99 99 99 99 99 99 99 99 99 97
2 88 98 98 99 99 99 99 99 99 99
3 10 63 88 89 97 99 99 99 99 99
4 10 10 10 10 10 10 38 94 98 98
5 10 10 10 10 10 10 10 10 10 10
6 10 10 10 10 10 10 10 10 10 10
7 10 10 10 10 10 10 10 10 10 10
8 10 10 10 10 10 10 10 10 10 10
9 10 10 10 10 10 10 10 10 10 10
Table 2: Percentage accuracy for different beta and d using polynomial kernel
The percentage accuracy for different beta and sigma using the RBF kernel is given in Table 3.
From the table it can be seen that with larger sigma (2, 4, and 8) the classifier can give good
results, but for smaller sigma the performance drops. Again, increasing beta has a negative effect
on the classifier.

Rows: sigma; columns: beta; entries: % accuracy
sigma \ beta 0.03125 0.0625 0.125 0.25 0.5 1 2 4 8 16
0.000977 10 10 10 10 10 10 10 10 10 10
0.001953 10 10 10 10 10 10 10 10 10 10
0.003906 10 10 10 10 10 10 10 10 10 10
0.007812 10 10 10 10 10 10 10 10 10 10
0.015625 10 10 10 10 10 10 10 10 10 10
0.03125 10 10 10 10 10 10 10 10 10 10
0.0625 10 10 10 10 10 10 10 10 10 10
0.125 30 30 30 30 30 30 10 10 10 10
0.25 92 92 92 92 92 92 10 10 10 10
0.5 92 92 92 92 92 92 10 10 10 10
1 14 15 18 27 88 21 10 10 10 10
2 90 90 90 90 90 91 92 13 10 10
4 98 98 98 98 98 96 96 96 95 95
8 98 98 98 99 99 97 95 95 95 95
16 99 99 97 97 95 94 94 94 95 95
Table 3: Percentage accuracy for different beta and sigma using RBF kernel
From the above experiments it is clear that an SVM classifier using raw pixel features can achieve
good performance on Bengali handwritten digit recognition. But it was strange to see that using
nonlinear kernels (polynomial or RBF) did not boost the performance; rather, they tend to show
lower performance for some parameter values. This can be explained by the training set used. The
number of features in the training set is 400 and the number of examples is 500, so applying
nonlinearity to the feature vector does no good here. Rather, using nonlinear functions can make
the classifier prone to overfitting in such cases, which explains the performance degradation.
Because of this problem we can see that highly regularized versions of the kernels perform
better.
The next set of experiments is done with the gray-scale data. The percentage accuracy for
different beta using the linear kernel is given in Table 4. From the table it can be seen that the
linear kernel gives a moderate result of around 80%, and that the performance is largely
independent of the regularization parameter once beta is 0.25 or larger.
Beta        0.03125  0.0625  0.125  0.25  0.5  1   2   4   8   16
% Accuracy  10       62      75     80    79   81  81  80  80  80
Table 4: Percentage accuracy for different beta using linear kernel – ISI data

From Table 5 we can see that the polynomial kernel gives reasonable accuracy only for degree = 1,
and that the performance is largely independent of the regularization parameter once beta is 0.25
or larger.
Rows: degree d; columns: beta; entries: % accuracy
d \ beta 0.03125 0.0625 0.125 0.25 0.5 1 2 4 8 16
0 10 10 10 10 10 10 10 10 10 10
1 10 67 77 80 79 81 81 80 80 80
2 10 10 10 10 10 10 10 10 10 10
3 10 10 10 10 10 10 10 10 10 10
4 10 10 10 10 10 10 10 10 10 10
5 10 10 10 10 10 10 10 10 10 10
6 10 10 10 10 10 10 10 10 10 10
7 10 10 10 10 10 10 10 10 10 10
8 10 10 10 10 10 10 10 10 10 10
9 10 10 10 10 10 10 10 10 10 10
Table 5: Percentage accuracy for different beta and d using polynomial kernel – ISI data
The next set of experiments is done with different sigma and beta using the RBF kernel; the
results are presented in Table 6. Here we can see that the RBF kernel consistently performs badly
for all beta and all sigma.
Rows: sigma; columns: beta; entries: % accuracy
sigma \ beta 0.03125 0.0625 0.125 0.25 0.5 1 2 4 8 16
0.000977 10 10 10 10 10 10 10 10 10 10
0.001953 10 10 10 10 10 10 10 10 10 10
0.003906 10 10 10 10 10 10 10 10 10 10
0.007812 10 10 10 10 10 10 10 10 10 10
0.015625 10 10 10 10 10 10 10 10 10 10
0.03125 10 10 10 10 10 10 10 10 10 10
0.0625 10 10 10 10 10 10 10 10 10 10
0.125 10 10 10 10 10 10 10 10 10 10
0.25 10 10 10 10 10 10 10 10 10 10
0.5 10 10 10 10 10 10 10 10 10 10
1 10 10 10 10 10 10 10 10 10 10
2 10 10 10 10 10 10 10 10 10 10
4 10 10 10 10 10 10 10 10 10 10
8 10 10 10 10 10 10 10 10 10 10
16 10 10 10 10 10 10 10 10 10 10
Table 6: Percentage accuracy for different beta and sigma using RBF kernel – ISI data

The main cause of the nonlinear classifiers not doing well on this dataset is overfitting due to
the small sample size. The dataset has some ambiguity too. There are many Bengali digits which
can be easily confused with each other because of the writing styles of different people. Some
sources of confusion are given below.
Another source of error is background noise and varying gray levels. The foreground gray levels
also vary. Normalizing the images using linear normalization or moment normalization can remove
the noise and thus provide good results. Again, the digit sizes are different, so the feature
vectors are different for the same digit. As I am using raw features without any normalization,
this affects performance. By using gradient features that are independent of the size or
orientation of the image we would get good results.
The last set of experiments is done with my own dataset, but with training and test samples taken
from different persons. The goal is to see the change in performance depending on individual
handwriting style. From Table 7 we can observe that the accuracy for the linear kernel drops to a
little over 50% and is independent of beta.
Beta        0.03125  0.0625  0.125  0.25  0.5  1   2   4   8   16
% Accuracy  53       52      53     53    53   53  53  53  54  54
Table 7: Percentage accuracy for different beta using linear kernel – different person.

From Table 8 it can be seen that the accuracy again drops to roughly 50%, and that only d = 1
gives reasonable results for every beta. Beta again affects the result for the higher degrees,
most visibly d = 3.
Rows: degree d; columns: beta; entries: % accuracy
d \ beta 0.03125 0.0625 0.125 0.25 0.5 1 2 4 8 16
0 10 10 10 10 10 10 10 10 10 10
1 53 53 53 53 53 53 53 53 54 55
2 43 49 51 53 54 54 54 54 55 54
3 24 22 43 49 53 53 53 53 53 53
4 10 10 10 10 10 10 19 33 42 52
5 10 10 10 10 10 10 10 10 10 10
6 10 10 10 10 10 10 10 10 10 10
7 10 10 10 10 10 10 10 10 10 10
8 10 10 10 10 10 10 10 10 10 10
9 10 10 10 10 10 10 10 10 10 10
Table 8: Percentage accuracy for different beta and d using polynomial kernel – different person
For the RBF kernel (Table 9), the only reasonable results come from sigma = 8, and the higher the
beta the lower the performance.
Rows: sigma; columns: beta; entries: % accuracy
sigma \ beta 0.03125 0.0625 0.125 0.25 0.5 1 2 4 8 16
0.000977 10 10 10 10 10 10 10 10 10 10
0.001953 10 10 10 10 10 10 10 10 10 10
0.003906 10 10 10 10 10 10 10 10 10 10
0.007812 10 10 10 10 10 10 10 10 10 10
0.015625 10 10 10 10 10 10 10 10 10 10
0.03125 10 10 10 10 10 10 10 10 10 10
0.0625 10 10 10 10 10 10 10 10 10 10
0.125 11 11 11 11 11 11 10 10 10 10
0.25 44 44 44 44 44 44 10 10 10 10
0.5 46 46 46 46 46 46 10 10 10 10
1 10 10 11 11 11 11 10 10 10 10
2 43 43 43 43 43 43 40 10 10 10
4 51 51 51 51 51 49 44 39 35 38
8 54 54 54 54 53 49 47 46 45 43
16 56 55 50 50 46 45 46 46 46 46
Table 9: Percentage accuracy for different beta and sigma using RBF kernel

The performance drop can be easily explained through the samples. There is a considerable
difference between the handwriting of two persons.
the feature vector differs between the training and the test set.
be done in raw feature to improve performance.
person and testing it with samples from another person degrades the performance. Again over
fitting due to small number of training data degrades the performance for the non
Few comparisons between two sample set
set and lower ten samples are from test set.
From the above experiments I can say that SVM can perform well for handwritten digit
recognition but the digits need to be preprocessed before giving input to the system.
[...] difference between the handwriting of two persons. Again, the size of the digits varies
considerably, so the feature vector differs between the training and the test set. Some kind of
normalization must be done on the raw features to improve performance. So training the classifier
with samples from one [...] overfitting due to the small number of training samples degrades the
performance of the non-linear kernels.
[...] two sample sets are given below. The upper ten digits are from the training set and the
lower ten samples are from the test set.
Normalization needs to be done to discard background noise and gray-level variability. Raw pixels
can work well if all the samples are normalized to the same size; otherwise some kind of oriented
features should be used. A good training set is needed so that the system can cope with the high
variance in the handwriting of different people; otherwise the performance drops significantly.
Lastly, non-linear kernels are only useful when the number of training samples is greater than
the number of features; otherwise overfitting can reduce performance, in which case linear
kernels give good performance. If we need to build a system for one person only, a small sample
of each digit is enough to train the classifier, and it would give very high accuracy.
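The normalization point above (discarding gray-level variability and bringing every sample to the same size before using raw pixels) can be illustrated with a minimal Python sketch. It is not the preprocessing used in this project: the function names, the threshold of 128, the 16 x 16 target size, and the assumption of dark ink on a light background are all chosen here for illustration only.

```python
def normalize_digit(image, size=16, threshold=128):
    """Binarize a grayscale digit, crop it to the ink, and rescale it.

    image: list of rows, each a list of gray levels in 0..255.
    Assumption for this sketch: dark ink on a light background.
    Returns a size x size grid of 0/1 values (1 = ink).
    """
    # Thresholding removes gray-level variability: ink -> 1, background -> 0.
    binary = [[1 if pixel < threshold else 0 for pixel in row] for row in image]

    # Find the bounding box of the ink so position and margins are discarded.
    rows = [r for r, row in enumerate(binary) if any(row)]
    if not rows:
        # Blank image: nothing to crop, return an empty grid.
        return [[0] * size for _ in range(size)]
    cols = [c for c in range(len(binary[0])) if any(row[c] for row in binary)]
    top, left = rows[0], cols[0]
    height = rows[-1] - top + 1
    width = cols[-1] - left + 1

    # Nearest-neighbour resampling of the cropped box onto a fixed grid.
    out = []
    for i in range(size):
        src_row = binary[top + (i * height) // size]
        out.append([src_row[left + (j * width) // size] for j in range(size)])
    return out


def to_feature_vector(grid):
    """Flatten the normalized grid into a raw-pixel feature vector."""
    return [value for row in grid for value in row]
```

With this kind of preprocessing, the same shape written at two different sizes and positions maps to the same feature vector, which is the property raw-pixel features need. Note that stretching the bounding box to a square does not preserve the aspect ratio of the digit; whether that matters for Bengali numerals is not something this sketch tests.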

Conclusion:
Handwriting recognition is a large research area within pattern recognition and image processing
because it is applicable in so many settings. SVM is a state-of-the-art method for handwriting
recognition and can provide very good accuracy for general-purpose systems. In this project we
learnt how SVM can be applied to Bengali digit recognition. We have seen that, given a proper
set of training data and good image processing techniques, oriented features with an SVM can
provide a high level of accuracy in digit recognition for Bengali script.
References:
1. Line Eikvil, "Optical Character Recognition",
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.25.3684
2. "Statistical Summaries". Ethnologue. 2005. Retrieved 2007-03-03.
3. A. K. Roy and B. Chatterjee, "Design of a Nearest Neighbor Classifier for Bengali
Character Recognition", J. IETE, vol. 30, 1984.
4. U. Pal and B. B. Chaudhuri, "OCR in Bangla: An Indo-Bangladeshi Language", Proc. of
12th Int. Conf. on Pattern Recognition, IEEE Computer Society Press, pp. 269-274, 1994.
5. B. B. Chaudhuri and U. Pal, "An OCR System To Read Two Indian Language Scripts:
Bangla And Devnagari (Hindi)", Proc. Fourth ICDAR, 1997.
6. Veena Bansal and R.M.K. Sinha, A Devanagari OCR and A Brief Overview of OCR
Research for Indian Scripts in Proceedings of STRANS01, held at IIT Kanpur, 2001.
7. A. A. Chowdhury, Ejaj Ahmed, S. Ahmed, S. Hossain and C. M. Rahman, "Optical
Character Recognition of Bangla Characters using neural network: A better approach".
2nd ICEE 2002, Khulna, Bangladesh.
8. Md. Abul Hasnat, S. M. Murtoza Habib, and Mumit Khan, Segmentation free Bangla
OCR using HMM: Training and Recognition, Proc. of 1st DCCA2007, Irbid, Jordan,
2007.
9. S. Arora, D. Bhattacharjee, M. Nasipuri, L. Malik, M. Kundu, D. K. Basu, Performance
Comparison of SVM and ANN for Handwritten Devnagari Character Recognition, CoRR, 2010.
10. T. K. Bhowmik, P. Ghanty, A. Roy and S. K. Parui, SVM-based hierarchical
architectures for handwritten Bangla character recognition, International Journal on
Document Analysis and Recognition, Volume 12, Issue 2, Pages 97-108
11. Sukalpa Chanda, Katrin Franke, Umapada Pal, Tetsushi Wakabayashi, "Text Independent
Writer Identification for Bengali Script", Proc. of 20th International Conference on
Pattern Recognition (ICPR), pp. 2005-2008, 2010.
12. A. F. R. Rahman, R. Rahman, M. C. Fairhurst, Recognition of handwritten Bengali
characters: a novel multistage approach, Pattern Recognition, Volume 35, Issue 5, May
2002, Pages 997-1006

13. Umapada Pal, Partha Pratim Roy, Nilamadhaba Tripathy, Josep Llados, Multi-oriented
Bangla and Devnagari text recognition, Pattern Recognition, Volume 43, Issue 12,
December 2010, Pages 4124-4136
14. Cheng-Lin Liu, Ching Y. Suen, A new benchmark on the recognition of handwritten
Bangla and Farsi numeral characters, Pattern Recognition, Volume 42, Issue 12, New
Frontiers in Handwriting Recognition, December 2009, Pages 3287-3295
15. Subhransu Maji and Jitendra Malik, Fast and Accurate Digit Classification,
http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-159.pdf
16. Corinna Cortes and V. Vapnik, "Support-Vector Networks", Machine Learning, 20, 1995.
17. B. E. Boser, I. M. Guyon, and V. N. Vapnik. A training algorithm for optimal margin classifiers.
In D. Haussler, editor, 5th Annual ACM Workshop on COLT, pages 144-152, Pittsburgh, PA,
1992. ACM Press
18. http://en.wikipedia.org/wiki/Support_vector_machine
19. Edson J.R. Justino, Flavio Bortolozzi, Robert Sabourin, A comparison of SVM and
HMM classifiers in the off-line signature verification, Pattern Recognition Letters,
Volume 26, Issue 9, 1 July 2005, Pages 1377-1385
20. ISI Bengali Numerals. http://www.isical.ac.in/~ujjwal/download/BanglaNumeral.html
