Segmentation-Verification for Handwritten Digit Recognition


Doctoral Thesis / Dissertation, 2016

114 Pages


Excerpt


À CEUX QUE J'AIME ...ET CEUX QUI M'AIMENT
TO THOSE I LOVE AND THOSE WHO LOVE ME

II
AA
BSTRACT
Automatic reading of digit fields from an image document has been proposed in several
applications such as bank checks, postal code and forms. In this context, two main problems
are occurred when attempting to design a handwritten digit string recognition system. The first
problem is the link between adjacent digits, which can be naturally spaced, overlapped or/and
connected. The second problem is the unknown-length of the digit string, which is not carefully
written by people in real-life situations.
In this thesis, SVM-based segmentation-verification system for segmenting two connected
handwritten digits using the oriented sliding window is proposed. It employs a segmentation-
verification system using conjointly the oriented sliding window and Support Vector Machine
(SVM) classifiers. Experimental results showed that the proposed system is more appropriate
for segmenting simple and multiple connections. Its main advantage lays in the use few rules
for finding the optimal segmentation path. Hence, the proposed approach constitutes a tradeoff
between the correct segmentation and the number of the segmentation cuts.
Thereafter, we propose a new design of a handwritten digit string recognition system based on
the explicit approach for the unknown-length digit strings. Three methods are combined
according the link of adjacent digits, which are the histogram of the vertical projection
dedicated for spaced digits, the contour analysis dedicated for overlapped digits and the Radon
transform performed on the sliding window dedicated for connected digits. A recognition and
verification module based on Support Vector Machine (SVM) classifiers allows analyzing and
deciding the rejection or acceptance each segmented digit image. Experimental results
conducted on the benchmark dataset show that the proposed system is effective for
segmenting handwritten digit strings without prior knowledge of their length comparatively to
the state-of-art.

III
RR
ESUME
La lecture automatique de champs numériques à partir d'une image de document a été
envisagée dans plusieurs applications telles que les chèques bancaires, code postal et les
formulaires. Dans ce contexte, deux principaux problèmes sont survenus lors de la tentative de
concevoir un système de reconnaissance de la chaîne de chiffres manuscrits. Le premier
problème est le type de liaison entre les chiffres adjacents, qui peuvent être naturellement
espacées, se chevauchent et/ou connectés. Le deuxième problème est la longueur inconnue de
la chaîne de chiffres, qui ne sont pas soigneusement écrit par les gens dans des situations
réelles.
Dans cette thèse, Un système de segmentation-vérification basé sur SVM est proposé pour
segmenter deux chiffres manuscrits connectés utilisant la fenêtre glissante orientée. Il emploie
un système de segmentation-vérification utilisant conjointement la fenêtre glissante orientée et
le classifieurs Support Vector Machines (SVM). Les résultats expérimentaux ont montré que le
système proposé est plus approprié pour segmenter des simples et multiples
connexions. L'avantage principal réside dans l'utilisation de quelques règles pour trouver le
chemin de segmentation optimale. Par conséquent, l'approche proposée représente un
compromis entre la segmentation correcte et le nombre de coupes de segmentation.
Ensuite, nous proposons une nouvelle conception d'un système de reconnaissance de la chaîne
de chiffres manuscrits basée sur l'approche explicite pour la longueur inconnue de la chaîne de
chiffres. Trois méthodes sont combinées selon le type de la liaison entre les chiffres adjacents,
qui sont l'histogramme de projection verticale dédiée pour les chiffres espacés, l'analyse du
contour affecté aux chiffres qui se chevauchent et la transformée de Radon effectuée sur la
fenêtre glissante destiné aux chiffres connectés. Un module de reconnaissance et vérification
est basé sur le classifieurs Support Vector Machine (SVM) qui permet d'analyser et de décider le
rejet
ou l'acceptation de chaque image
de
chiffres
segmentés. Les résultats
expérimentaux effectués sur la base de données du benchmark montrent que le système
proposé est efficace pour segmenter les chaînes de chiffres manuscrits sans connaissance au
préalable de leur longueur relativement à l'état de l'art.

IV
AA
CKNOWLEDGEMENTS
First, I would like to thank Prof. Youcef CHIBANI for her valuable contributions, corrections and
discussions.
I would like to thank the members of the LISIC Laboratory for their encouragement and
support during the preparation of this thesis.
In addition, I would like to thank the members of my examining committee. Their comments
helped to improve the quality of the final version of this thesis.

V
CC
ONTENT
Abstract
II
Résumé
III
Acknowledgements
IV
Content
V
List of figures
VI
List of tables
IX
List of acronyms and symbols
X
Introduction
01
Chapter 1. 0verview of handwritten digit recognition
1.1. Introduction
05
1.2. Character recognition systems
05
1.2.1. Acquisition mode
06
1.2.1.1. On-line mode
06
1.2.1. 2.Off-line mode
07
1.2.2. Preprocessing
07
1.2.2.1. Noise reduction
07
1.2.2.2. Normalization
07
1.2.2.3. Smoothing
09
1.2.2.4. Skeletonization
09
1.2.3. Segmentation
09
1.2.4. Feature generation
11
1.2.4.1. Global features
12
1.2.4.2. Statistical features
12
1.2.4.3. Geometrical and topological features
12
1.2.5. Classification techniques
12
1.2.5.1. Statistical techniques
13
1.2.5.2. Structural techniques
14
1.2.5.3. Stochastic techniques
14
1.3. Overview of Support vector machines « SVM »
14
1.3.1. Basic principles
15
1.3.2. SVM Kernel
19
1.3.3. Multi-class SVM
21
1. 4. Summary
23
Chapter 2. Isolated handwritten digit recognition
2.1. Introduction
24
2.2. Overview of isolated handwritten digit recognition
25
2.3. Size normalization
27
2.4. Feature generation
28
2.4.1. Global features
28
2.4.1.1. Density
28
2.4.1.2. Center of gravity
28
2.4.1.3. Second order geometrical moments
29
2.4.1.4. Number of transitions
29
2.4.2. Hu's Moment Invariants
29
2.4.3. Skew
30
2.4.4. Zernike moments
30
2.4.5. Projections
31
2.4.6. Profile Features
31
2.4.7. Background features
31
2.4.8. Foreground features
33
2.4.8.1. Contour Based Features
33
2.4.8.2. Skeleton Based Features
33
2.4.9. Ridgelet Transform
34

VI
2.4.10. Region sampling: uniform grid
35
2.5. Recognition
36
2.6. Experimental results
36
2.6.1. Experimental results on NIST SD19
37
2.6.2. Experimental results on CVL Single Digit Database
39
2.7. Summary
41
Chapter 3. Segmentation of two connected handwritten digit recognition
3.1. Introduction
42
3.2. Overview of two connected handwritten digit recognition
43
3.3. Segmentation-verification system
46
3.3.1. Segmentation of connected digits
47
3.3.1.1. Finding the interconnection points
48
3.3.1.2. Finding the cutting path
49
3.3.2. Feature generation
50
3.3.3. Recognition and verification
51
3.4. System evaluation
52
3.4.1. Databases
53
3.4.2. Parameter tuning of the SVM model
54
3.4.3. Results
55
3.4.3.1. Influence of the orientation angle
55
3.4.3.2. Complexity of the proposed segmentation-verification
56
3.4.3.3. Comparative analysis
60
3.5. Summary
62
Chapter 4. Handwritten digit string recognition
4.1. Introduction
64
4.2. Overview of handwritten digit string recognition
65
4.3. Segmentation methods of the digit string
66
4.3.1. Segmentation of spaced digits
66
4.3.2. Segmentation of overlapped digits
67
4.3.3. Segmentation of connected digits
69
4.3.3.1. Adjustment of the width
(
)
70
4.3.3.2. Finding the angular cut via the Radon transform
71
4.4. Design of the digit string recognition system
73
4.4.1. Digit recognition-verification
74
4.4.1.1. Feature generation
74
4.4.1.2. Design of the SVM classifiers on isolated digits
75
4.4.1.3. Digit verification
75
4.4.2. Spaced digit recognition-verification
75
4.4.3. Overlapped digit recognition-verification
76
4.4.4. Connected digit recognition-verification
76
4.5. Experimental results
78
4.5.1. Databases and evaluation criteria
78
4.5.2. Experimental setup
79
4.5.3. Design of SVM classifiers on isolated digits
80
4.5.4. Evaluation of the digit segmentation
81
4.5.4.1. Influence of the range of the projection angle
81
4.5.4.2. Adjustment of the angular step
85
4.5.5. Comparative analysis
89
4.5.6. Computation cost of the proposed system
90
4.6. Summary
91
Conclusion
92
References
94

VII
LL
IST OF FIGURES
Figure 1.1. Steps of the typical character recognition system.
06
Figure 1.2. Samples of the basic methods for normalization: (a) Skew normalization (b) Slant
normalization (c) Size normalization.
08
Figure 1.3. Example of skeletonization
09
Figure 1.4. Misclassification caused by over-segmentation
11
Figure 1.5. Misclassification caused by under-segmentation
11
Figure 1.6. Maximum margin hyperplane and margins for an SVM
16
Figure 1.7. Hyperplane soft margin for non-linearly separable data
19
Figure 1.8. The data not linearly separable in 2-D are mapped onto three dimensions where a linear
decision surface between the classes can be made.
20
Figure 1.9. Regions not classified by OAA approach for a problem of three classes
21
Figure 1.10. Regions not classified by OAO approach for a problem of three classes
22
Figure 1.11. Directed acyclic graph SVM method.
23
Figure 2.1. Different configurations of concavity
32
Figure 2.2. Concavity labels for digit `9'
32
Figure 2.3. Contour detection: (A) Contour of the upper region, (B) Feature vector, and (C) 8-Freeman
directions
33
Figure 2.4. Features extracted from the skeleton
34
Figure 2.5. Example of splitting a digit using a uniform grid (2x2)
35
Figure 3.1. Different types of connected digit samples.
47
Figure 3.2. Possible segmentation paths using IP and BP: Hypothesis 1 (b) Hypothesis 2 (c) Hypothesis 3.
49
Figure 3.3. Crossing and the segmentation paths according the orientation of the window.
50
Figure 3.4. Illustrative example for segmenting two connected digits using the oriented sliding window
and recognition-verification strategy.
52
Figure 3.5. Number (#) of segmentation cuts versus the orientation angle
(°)
56
Figure 3.6. The overall rate (%) versus the orientation angle
(°)
57
Figure 3.7. The overall rate (%) versus the number (#) of segmentation cuts
57
Figure 3.8. Unsuccessful segmentation-verification of connected digits produced by the proposed system
59
Figure 3.9. Performance of the segmentation-recognition algorithms
61
Figure 4.1. Correct and incorrect segmentation using the HVP.
67
Figure 4.2. Sample images of broken handwritten digits.
67
Figure 4.3. Segmentation by contour analysis. (a) Possible segmentation. (b) Impossible segmentation
67
Figure 4.4. Examples of incorrect segmentation when using the contour analysis.
68
Figure 4.5. Processing of broken parts (a) two distinct overlapped single-digits (Rule1) (b) broken single-
digit (Rule2) (c and d) broken digit (Rule 3)
69
Figure 4.6. Influence of adjusting the width on the sliding window.
71
Figure 4.7. Radon transform for three examples of connected digits. The maximum value of the Radon
transform corresponds to the orientation angle
for cutting two connected digits.
72
Figure 4.8. Two selected projections of the Radon Transform showing
= 51°
72
Figure 4.9. Impact of selecting the orientation angle for segmentation (a) Segmentation with SWRT (b)
Segmentation without SWRT
73
Figure 4.10. Full segmentation system for handwritten digit string recognition.
74
Figure 4.11. Some samples of connected digits considered by SCA as isolated digits.
76
Figure 4.12. Connected digit recognition-verification (a) Original connected digits (b) Scan IPs (c) Fixing
sliding window Radon transform around IP (d) Segmentation paths (e) Final decision.
77
Figure 4.13. Segmentation example of our proposed digit string recognition system.
78

VIII
Figure 4.14. Impact of selecting the range of the projection angle for detecting the cutting path from
[1,179] to [80, 100].
82
Figure 4.15. Influence of selecting the range of the projection angle (°) for different digit string lengths:
(a) 2-Digit string length (b) 3-Digit string length (c) 4-Digit string length (d) 5-Digit string
length (e) 6-Digit string length (f) 10-Digit string length.
84
Figure 4.16. Influence of adjusting the angular step (°) for different digit string lengths: (a) 2-Digit string
length (b) 3-Digit string length (c) 4-Digit string length (d) 5-Digit string length (e) 6-Digit
string length (f) 10-Digit string length.
87
Figure 4.17. Impact of selecting the angular step for detecting the cutting path: (a) Correct segmentation
when the angular step is fixed to 3°. (b) Incorrect segmentation when the angular step is
fixed to 6°.
88

IX
LL
IST OF TABLES
Table 2.1. Summary of features
36
Table 2.2. The recognition rate for various experiments
38
Table 2.3. Recognition results on individual features
39
Table 2.4. Recognition results on feature combinations
40
Table 2.5. Recognition rates on individual digits
40
Table 2.6. Comparison of proposed method with state-of-the-art methods
41
Table 3.1. Comparison of different segmentation methods
46
Table 3.2. Distribution of the NIST SD19 database for handwritten digits
53
Table 3.3. Types of connected numeral strings
54
Table 3.4. Distribution of the database regarding the type of connection
54
Table 3.5. The recognition rate for various orientation angles
55
Table 3.6. Segmentation effect according the orientation angle
56
Table 3.7. Some examples of the successful segmentation-verification produced by the proposed system
according the connection type
58
Table 3.8. Some examples of the unsuccessful segmentation-verification produced by the proposed system
according the connection type
59
Table 3.9. Comparative analysis
60
Table 3.10. Rank of our method comparatively to the existing methods according the connection type
62
Table 4.1. Number of digit string samples (#Strings) distributed according to the Numbers of Spaced Digits
(#SD) and Connected and/or Overlapped Digits (#C-OD) expressed also in %.
79
Table 4.2. Recognition rate (%) obtained when training and retraining SVM classifiers.
81
Table 4.3. Average rates obtained for all digit string lengths for different ranges of the projection angle (°)
85
Table 4.4. Recognition rates according to the angular step for each string length.
88
Table 4.5. Comparative analysis of various segmentation systems for the unknown-length string performed
on NIST NSTRING SD19
89
Table 4.6. Examples of the correct and incorrect segmentation-recognition produced by the proposed
system from the NSTRING SD19 Database.
90

X
LL
IST OF ACRONYMS AND SYMBOLS
PDA
Personal Digital Assistant
Tablet PCs
Tablet Personal Computers
SAM
Spectral Angle Mapper
ANN
Artificial Neural Networks
SVM
Support Vector Machines
NN
Nearest Neighbor
HMM
Hidden Markov Models
VC
Vapnik and Chervonenkis
KKT
Karush Kuhn Tucker
RBF
Radial Basis Function kernel
KMOD
Kernel with Moderate Decreasing
OAA
One Against All
OAO
One Against One
DAG
Directed Acyclic Graph
GMM
Gaussian Mixture Models
CVL
Computer Vision Lab
NIST
National Institute of Standards and Technology
MNIST
Modified NIST
NIST SD19
NIST Special Database 19
CENPARMI
Centre for Pattern Recognition and Machine Intelligence
CEDAR
Center of Excellence for Document Analysis and Recognition
NSTRING SD19 NIST STRING Special Database 19
MLP
Multi-Layer Perceptron
FIR MLP
Finite Impulse Response MultiLayer Perceptron
LIBSVM
A Library for Support Vector Machines
SOM
Self-Organizing Maps
IP
Interconnection Point
BP
Base Point
SWRT
Sliding Window Radon Transform
AHDSR
Automatic Handwritten Digit String Recognition
HVP
Histogram of the Vertical Projection
SC
Segmented Components
SCA
Segmented Component Analysis

XI
DRV
Digit Recognition-Verification
GC
Grouped Component
GCA
Grouped Component Analysis
BSVM
binary SVM
SD
Spaced Digits
C-OD
Connected and/or Overlapped Digits
w
Perpendicular vector to the separating hyperplane of an SVM
w
Transpose of w
x
Input space showing an example of the database
w
Bias setting an SVM
Possible decision space associated with an example x
VC dimension
N
Number of training set
probability
C
Regularization constant
Lagrange multipliers
Non-negative slack variables
Lagrangean function
Mapping function (quadratic transform)
K
Kernel function
m
Dimension of the feature space
Decision function of SVM
Scale parameter of RBF and KMOD
Parameter of KMOD.
m
Two-dimensional geometric moment
Zero
th
order moment
10T
Central moments
,
R
Second-order moments
The seven
th
moment invariant
,
Zernike polynomials
R
,
Radial polynomial
,
Zernike moments
()
Image function
,
Amplitudes of Zernike moments
Dirac distribution
Angular variable
Radial variable
Radon transform

XII
Orientation angle
Distance between two digits
Width of Sliding Window Radon Transform
Angular cut
Parameter defined experimentally
( , )
Radon transform
(
) Maximal value selected from 10 responses provided by SVM classifiers
Feature vector
Threshold for Digit verification
Fixed decision threshold

1
II
NTRODUCTION
Automatic handwriting recognition is intended to convert images that are understandable by
humans, in an interpretable code by a machine. It has been a subject of intensive research for
about last fifty years. The problem, which is very simple for almost every human, is extremely
complicated for the machine. Several techniques and methods have been proposed in order to
build faster and more reliable systems. However, despite all the efforts in this area, there is still
a significant gap between human and machine performances. Two fields can be released:
Online handwriting recognition and offline handwriting recognition. In Online recognition,
characters are recognized when written. Stroke information is mono-dimensionally captured
dynamically, represented by the pen trajectory. Offline handwriting recognition is different from
online handwriting recognition, because here, stroke information is not accessible. The
information is two-dimensional represented as an image and captured of the scanned device of
the characters. Offline handwriting recognition is less accurate than online systems because the
temporal information is absent. The latter is the subject of this thesis.
Offline handwriting recognition itself can be divided into two approaches, according to the
number of persons whose writing must be recognized. When this number is limited, the
recognition system can support a specific training on the base of the own writing to those
persons. The system is called the mono-script recognition. When the writings is composed from
a very large number of different persons, the system is called the multi-script recognition.
Hence, this work takes place in this theses.
A typical offline handwriting recognition system usually consists of three main processing steps:
preprocessing, feature generation and recognition using a classifier. Preprocessing consists of a
sequence of operations performed to the images in order to prepare them for feature
generation. These common operations include noise reduction, document skew correction, slant
correction, normalization, smoothing and skeletonization.
The feature generation allows representing an image as a vector of features using various
extraction techniques, which can be redundant or not. When features are redundant, a
selection algorithm can be performed to reduce the size of the input feature vector to avoid the
so-called curse of dimensionality problem.
Finally, there are also a large number of classifiers for recognition ncluding the statistical, the
structural, the stochastic classifiers and finishing on combination of classifiers. At each step,
selecting the appropriate parameters could affect the final classification performance.

2
PProblem Statement
The challenges of unconstrained handwritten digit recognition are related mainly from
the following factors:
Multi-script recognition: is considered as a complex problem for the recognition of
unconstrained handwritten digits caused by the variability of writing;
Lack of constraints: imposed upon the writing (slopes, slants, overlaps and interconnections
of digits);
Variety of writing styles: are multiple such as stick, detached, mixed and cursive script;
Tools of writing: make the line thickness not unified;
Insufficient inking: causes scanning defects largely due to the age of the writer and the
quality of the paper;
The segmentation into isolated digits can produce pseudo-digits or connected digits.
At the end this step, the system must be robust in order to avoid an over-
segmentation and under-segmentation.
In summary, the difficulties of the unconstrained handwritten digit recognition are not only
located in recognizing individual digits, but also to separate out digits each other within the
string through segmentation. It can be conducted by considering three following situations:
spaced, overlapped or connected digits. In most cases, the overlapped and connected digits are
the frequent observed situations.
Currently, the focus of our work is the development of an Automatic Handwritten Digit String
Recognition (AHDSR), which is required in many applications such as the amount of the bank
checks, postal code and forms. In this context, some complex problems are occurred for
recognizing handwritten digit strings: the presence of the noise, broken digits, overlapping
digits, connected digits and unknown length of the string.
The literature shows different methods for AHDSR. Some works are based on contours, the
background and others on the skeleton. Others combine the two approaches, using the
skeleton and contour allowing to deduce the potential cutting points and in order to achieve a
better performance.
In reviewing the various explicit segmentation methods for AHDSR, the literature presents two
different approaches: segmentation-recognition and recognition-based. The relationship
between segmentation and recognition when using the recognition-based, each step depends
on the other. In this case, the priori knowledge is required in the segmentation step. Generally,
algorithms based on the segmentation-recognition approach allows to separate the digit
string into segments using rules without recognition.

3
In this work, the segmentation of handwritten digit string into digits is the most crucial part in
AHDSR. We shall deal with the foregoing problems by using a recognition-based approach
based on segmentation and verification.
G
Goals of the w ork
The primary goal of this research is focused on the recognition of isolated digits. The main
challenges in handwritten digit recognition arise from variations in size, shape, slant, and most
importantly, the differences in the writing styles of individuals. In this research, we are
interested in enhancing the feature generation step for isolated digit recognition used for
avoiding digit normalization. The idea is to find a combination of multiple features which
improves the overall recognition rates by minimizing the intra-class variability and maximizing
inter-class variability, the most desirable requirement of any pattern recognition system. The
performance of an AHDSR system depends in particular on the design of a robust Support
Vector Machines (SVM) classifier.
A second goal of this research lies in the recognition of two connected handwritten using a
SVM-based segmentation-verification system for segmenting simple and multiple connected
digits using an oriented sliding window. Two morphological features are combined based on the
contour and the skeleton for detecting and splitting correctly the connected digits. In order to
avoid the over-segmentation and the under-segmentation, a sliding window is used for finding
interconnection points. When the interconnection points are found, the window is oriented to
define the optimal cutting path. The recognition and verification process is performed using the
Support Vector Machines (SVM) through the One-Against-All (OAA) implementation.
The third goal of this research is to propose a solution for recognizing handwritten digit strings.
based on the explicit approach for the unknown-length. Three methods are combined according
the link of adjacent digits, which are the histogram of the vertical projection dedicated for
spaced digits, the contour analysis dedicated for overlapped digits and the Sliding Window
Radon Transform method dedicated for connected digits. Beside, the proposed system uses
Support Vector Machine (SVM) as classifier for recognition and verification.
Experiments carried out on isolated digits, two connected digits and strings of digits are are
performed for evaluating these three contributions using standard datasets.
Outline of the Thesis
This thesis consists of four chapters as described as follows:
Chapter 1 presents a brief overview of the handwritten character recognition system in order to
be able to introduce all definitions related to the system. This attempt is to bring out all steps of

4
the recognition system. It provides a detailed description of the popular Support Vector
Machine (SVM) classification technique since it is more accurate than other classifiers in many
areas of applications for data classification.
Chapter 2 describes the combination of different statistical and structural features for
recognition of isolated handwritten digits. These features include some global statistics,
moments, profile and projection based features and features generated from the contour and
skeleton of the digits. Some of these features are extracted from the complete image of the
digit while others are extracted from different regions of the image by first applying a uniform
grid sampling to the image. Classification is carried out using one-against-all SVM
implementation. The experiments are conducted on the isolated handwritten digit NIST SD19
and the CVL Single Digit Database.
SVM-based segmentation-verification system for segmenting two connected handwritten digits
using the oriented sliding window is addressed in Chapter 3. This latter describes a
segmentation-verification system using conjointly the oriented sliding window and support
vector machine (SVM) classifiers. The oriented sliding window is used for finding at the same
time the interconnection points and the optimal angle for cutting the adjacent digits. Whilst,
classifiers are used for recognizing and verifying the correct segmentation via a global decision
module for accepting or rejecting the processed image. Experimental results conducted on a
large synthetic database of handwritten digits show the effective use of the oriented sliding
window for segmentation-verification.
Chapter 4 is dedicated to propose a new segmentation and recognition system for unknown-
length handwritten digit strings. Three segmentation methods are combined based on the
histogram vertical projection, contour analysis and the Sliding Window Radon Transform
(SWRT) for recognizing unknown-length handwritten digit strings with the help of the
verification strategy stage. A recognition module based on the SVM classifier allows analyzing
and deciding the rejection or acceptance for each segmented digit image. Experimental results
conducted on the benchmark dataset show that the proposed strategy is effective for
segmenting handwritten digit strings.
Finally, we conclude this thesis by summarizing the main contributions of this work and new
directions are proposed for its improvment.

5
CC
HAPTER
O
VERVIEW OF HANDWRITTEN DIGIT RECOGNITION
Handwritten character recognition is an important area in image processing and pattern
recognition field. Its goal is to transform the handwritten characters to automatically recognized
by a machine readable. In this context, this thesis focuses on the handwritten digit recognition
from an image document since many applications can be envisaged as bank checks, postal code
and forms. Hence, this chapter is devoted to overview the main modules composed a
handwritten digit recognition system.
1.1. Introduction
Automatic reading of digit fields from an image document has been proposed in several
applications such as bank checks [1], postal code [2] and forms [3]. The digit fields can
be printed (or typewritten) and written by a writer text.
The first case is considered as a closed problem since all difficulties have overcome. In contrast,
the second case is entirely different because of the high variability of the writing [4].
This chapter is devoted to overview the different modules that allow recognizing a handwritten
digit character. Basically, a handwritten digit recognition system is composed of five distinct
modules: acquisition, preprocessing, segmentation, feature generation and classification
followed by an optional post-processing module.
1.2. Character recognition systems
The design of a handwritten character recognition system can be conducted mainly in five
steps: acquisition, preprocessing, segmentation, feature generation and classification. The first
module is the image acquisition, where a digital image of the text is obtained. This could be
done off-line using a scanner or on-line by a digital pen/stylus. Next, the preprocessing module
generally consists of several methods used to improve the quality of images for further
processing. The segmentation module is used for separating the overlapping and/or joining of
adjacent digits into elementary digits in order to deduce the possible distinct classes [5-9]. After
that, a feature generation is performed on the digit image for reducing the dimension of the
representation and thus makes the design of the classification system. Next, a decision function

6
Training
Reference
patterns
allows assigning a character image to predefined class. Figure 1.1 illustrates the different
modules for recognizing a handwritten digit.
The following sections describe each module composed a handwritten digit recognition system.
Figure 1.1. Steps of the typical character recognition system.
1.2.1. Acquisition mode
The acquisition of handwritten characters is done in two modes: on-line or off-line. Each mode
has its own acquisition tools and its corresponding recognition algorithms. On-line mode takes
into account the chronological information of the movements of the writer's hand; whereas the
off-line mode treats the information in delayed independently of the time.
1.2.1.1. On-line mode
This mode works in real time (during writing) i.e. the digital ink sample is composed as a set of
coordinates ordered in time. It is possible to track the path to know the pen-tip movements as
well as pen-up/pen-down switching and eventually the slope and speed.
This is notably the case of light pens, digital pens or styluses on touch screens, PDAs or Tablet
PCs. The symbols are recognized as they are written by hand. The obtained signal is converted
into character codes which are recognized by means of character recognition systems. The
writing is represented as a set of points whose coordinates are a function of time, which can be
regarded as a digital representation of handwriting [10].
On-line recognition has a great advantage for its possible correction and modification
interactively of writing having regard continuous system response [11].
Acquisition
Preprocessing
Segmentation
Feature generation
Classification
Recognition
Recognized
character

7
1.2.1.2. Off-line mode
The off-line handwriting recognition is performed on an image of a scanned document. In this
context, the acquired data is regarded as a static representation of handwriting. This mode is
appropriate for printed documents and manuscripts previously written after the acquisition.
The design of the off-line handwriting recognition is difficult comparatively to the on-line
handwriting recognition system since many desirables characteristics are not available as the
velocity, the pressure and the coordinates of the character. This mode can be considered as the
most general case for recognizing a handwriting character.
In our case, the image of the written text acquired by the off-line mode involves capturing the
text image using physical sensors (scanner, camera ...) with a minimum degradation. During
this step, despite the good quality of acquisition systems, noises might appear in scanned
document images. This is caused by the texture type, the area and its lighting.
1.2.2. Preprocessing
Preprocessing is performed essentially for reducing the superimposed noise on the image data
and trying to keep only the material information of the document. The noise may be due to
acquisition conditions (lighting, wrong placing the document ...) or the quality of the original
document.
One of the problems in the recognition of handwritten characters is the skew/slant detection
and correction in the text document, which introduces challenges for segmentation. Therefore,
the preprocessing of the images normally yields better results. These tasks commonly
include noise reduction, normalization, smoothing and skeletonization.
1.2.2.1. Noise reduction
Prior to the character recognition, it is necessary to eliminate the imperfections. Noise reduction
is the process of removing noise from an image. The imperfection in the optical scanning
devices intensity of light, scratches on the camera scanner lens or the writing instrument causes
disconnected line segments introduces noises in the scanned images.
There are many techniques to reduce the noise. Basically, the filtering function is used to
remove the noises and diminish spurious points in the image. For example, the symmetric
Gaussian filter function is used for smoothing equally in all directions. An alternative approach is
the use of the Morphological operations, which are basically neighbourhood operations. It is
performed on the input image using a structuring element. Two basic morphological operations
are used: dilation and erosion [1-2, 6-7].
1.2.2.2. Normalization
The normalization method is popularly used in character recognition to reduce all types of
variations and to obtain standardized data. However, it also gives rise to excessive shape

8
distortion and eliminates some useful information. The usual methods for normalizing a
character are the following:
Skew normalization::
Due to variation in the writing style, the skew can hurt the effectiveness
of recognition and therefore should be detected and corrected with respect to the
baseline (see Fig. 1.2.a). Various methods have been used, which are the projection profile
of the image [12], the Hough transform [13] or the shape of nearest neighbor clustering
[14]. After skew detection, the character or word is translated to the origin and rotated until
the baseline is horizontal.
Slant normalization:
The character inclination typically found in cursive script is called slant.
Formally, it is defined as the angle between longest stroke in a word and the vertical
direction referred to the word slant. Slant normalization is used to normalize all characters to
a standard form with no slant (see Fig. 1.2.b). Many methods have been proposed to detect
and to correct the slant of cursive words. One of the used methods is based on the center of
gravity [15], another method uses the projection profiles [16] and some used a variant of
the Hough transform [17].
Size normalization:
It is used to adjust the size, position and shape (dimension) of the
character image. This step is required for reducing the shape variation between images of
the class to facilitate the feature generation and improve their classification [18] (see
Fig. 1.2.c).
Figure 1.2. Samples of the basic methods for normalization: (a) Skew normalization (b) Slant
normalization (c) Size normalization.
(a)
(b)
(c)

9
1.2.2.3. Smoothing
The smoothing operation is done to regularize the edges in the image, to remove small bits of
noise and to reduce the high frequency noise in the image [19]. Furthermore, different
preprocessing methods are used for smoothing image in order to acquire more accurate output
image. In freeman's direction extraction, smoothing is done by comparing each code with
previous and next code [6, 18].
1.2.2.4. Skeletonization
Skeletonization is a morphological operation used for reducing foreground regions (contour) in
a binary image to a skeletal, which the connectivity of the original region is detected while
destroying most of the original foreground pixels.
Methods for the skeletonization are divided into two main approaches: iterative and non-
iterative [20]. When using the iterative approach, the peeling contour process parallel or
sequentially by erasing or removing the unwanted pixels in each iteration. In contrast, the non-
iterative approach, the skeleton is straightforward extracted without examining each pixel
individually. Unfortunately, these techniques are difficult to implement and slow as well.
Thinning can be somewhat performed for skeletonization using methods like erosion or
opening. In this mode, it is commonly used by reducing all lines to single pixel thickness as
shown Fig. 1.3.
Figure 1.3. Example of skeletonization.
1.2.3. Segmentation
When designing a digit recognition system, the most important step is the segmentation of the
digit string to obtain isolated digits. This step is a non-trivial problem, due to several
complexities. The first one is the inherent nature of the script that is cursive and at the same
time overlapped. The second one is the high degree of variation in writing styles produced by
writers.

10
The segmentation systems can be divided into two approaches: implicit and explicit.
Implicit approach
considers all traced points as potential segmentation points [3, 9]. In this
case, the segmentation and recognition are performed simultaneously for recognizing a digit
string. Indeed, this approach does not attempt to separate digits, but rather it incorporates
the implicit segmentation into the recognition module.
Explicit approach
is performed by finding the best way to separate adjacent digits before
recognition. In this case, the segmentation and recognition are performed separately. Also,
three cases can be occurred when attempting to separate two adjacent digits: spaced,
overlapped and/or connected digits. In most cases, the overlapped and connected digits are
the frequent observed situations.
Hence, many algorithms were proposed to separate the couple or string of contiguous digits. In
this case, the explicit segmentation algorithms can be categorized in two groups [8-9]:
The recognition-based approaches
usually generate multiple candidate segmentation points
which are verified with recognizer for choosing the optimal segmentation points. Although
this approach gives better efficiency than the segmentation-recognition approaches, the
main weakness lies in the computational time to compare all the segmentation hypotheses.
The segmentation-recognition approaches
are based on the morphology of the connected
digits presented by the contour (concavities, profiles) and the skeleton (Background,
foreground). It is used to construct the segmentation paths where each segmented
component should contain an isolated digit, which is submitted to the recognizer. These
methods are generally faster than recognition-based because a number of separation
attempts must be performed before a segmentation.
Generally, the recognition reaches very high performance when dealing with spaced digits.
However, when a complete segmentation system is used, the recognition is more difficult since
it can generate either an over-segmentation or under-segmentation.
When the over-segmentation problem is occurred, the segmentation cut is performed in intra-
digits and provides better results than other inter-digit segmentation cuts. Figure 1.4 shows
some examples of the over-segmentation problems.

11
20 020
19 101
14 121
Figure 1.4. Misclassification caused by over-segmentation
When the under-segmentation problem is occurred, the lack segmentation produces wrong
result than the correct segmentation cuts. For example, the connected digits often are
recognized as isolated digit. Therefore, this problem can provide the confusion between isolated
and under-segmented digits. Some examples of the under-segmentation problem are shown in
Figure 1.5.
00
18
Figure 1.5. Misclassification caused by under-segmentation
The proposed segmentation system for separating the couple or string of contiguous digits will
be discussed in detail later in the Chapters 3 and 4.
1.2.4. Feature generation
The feature generation can be defined as a problem for extracting the most pertinent
information from the image for a classification problem i.e. which minimizes the intra-class
variability and maximizes the inter-class variability [21-23]. This pertinent information often is
represented by a numeric value vector.
The feature generation methods are many and varied. Each one has its own properties and can
only apply to various contexts and different conditions. Before bringing our choice on a method
than another, it is important to know all the ins and outs of their use. There are numerous
types of features but they are broadly classified into three types: Global features, Statistical
features, and Geometrical and topological features.

12
1.2.4.1. Global features
The global features are generated from the entire character image using for instance the center
of gravity, moments, the coefficient obtained from mathematic transforms such as Fourier,
wavelet and Ridgelet coefficients... [23].
The global features is faster speed since the required values for calculating and the matching
classification is convenient, but the distinction ability of the character details is weak, and
sensitive to the image deformation. It is generally used for the simple character detection.
1.2.4.2. Statistical features
The statistical features are derived from the statistical distribution of pixels in the character
image of the pixels [21]. They offer high speed and low complexity and take care of writing
variations to some extent. They may also be used for reducing the dimension of the feature set.
One of the most common statistical features is the moments extracted from images (Hu,
Zernike moments...). Therefore, the Zoning used for dividing the character into several frame
containing overlapping or non overlapping zones (angular, concentric and circles grids). Finally,
the horizontal or vertical projection features count the number of foreground pixels.
1.2.4.3. Geometrical and topological features
The geometrical and topological features may represent global and local properties of the
character's structure and have high tolerances to deformation, distortions and writing variations
(translation and rotation) [22]. The topological features can encode some knowledge about the
character structure or may require some knowledge for this kind of components.
The specific features include shape descriptor and geometric structure features. The shape
descriptor is used to describe the character structure while the geometric structure feature is
used for reflecting character shape structure and the stroke segment change such the direction
of the stroke segment due to writing variations.
The geometrical and topological features generated from the character include the complex
stroke as curves and splines, such as stroke directions, end points, intersections of line
segments and loops, the number of holes, size of slopes and curves, length and thickness of
strokes, the areas and perimeters...
All these measures can be integrated into a single feature vector for recognition of a
handwritten character.
1.2.5. Classification techniques
The classification is a technique that allows assigning an unknown pattern into a predefined
class [24]. The classification workflow uses either unsupervised or supervised methods to
categorize character features into many classes. It can perform an unsupervised classification

13
without providing training data. In this case, the classes are generated according to the
resemblance between samples. In contrast, the supervised classification requires training data
and specify a classification method such as the maximum likelihood, minimum distance,
Mahalanobis distance, or Spectral Angle Mapper (SAM).
For handwritten digit recognition, the nearest neighbour classifier is one of the simplest and
widely used for classifying the digit image [32]. Artificial Neural Networks (ANN) are another
well studied classification method, and widely used for handwritten character recognition [33].
Hidden Markov Model (HMM) is also a popular approach to this problem which uses both the
statistical and structural information contained into handwritten digit shapes [85]. Support
Vector Machines (SVM) originally developed as a two class classifiers have been extended to
multi-class problems and applied to handwritten character recognition [28,88].
In the case of the handwritten digit recognition, supervised classification has been adopted
since digit classes are well specified. It can be performed in two stages:
Training stage:
The goal of the training stage is to train the classifier with the known digit
dataset for further recognition with unknown digit dataset.
Recognition and decision stage:
This stage classifies the input pattern by comparing them to
a list of reference patterns. This stage also uses classification techniques in the form of
decision. The decision stage is strongly influenced by the feature generation step, and a
successful of the classifier.
1.2.5.1. Statistical techniques
Statistical techniques are based on a statistical decision theory that uses the statistical decision
functions and a set of optimality criteria. It is performed on the shapes to be recognized.
However, it requires a significant number of samples in order to achieve correct training of the
probability distributions for different classes. The study allows reflecting their distribution in a
metric space and the statistical characteristics of the classes allow to take a decision as higher
probability of the class membership (more confidence) [24].
The main statistical techniques which are applied in the character recognition field are the
followings [18, 24]:
Parametric Recognition:
This method requires a priori information about the characters for
obtaining a parametric model for each character. Therefore, once the parameters of the
model which may be based on some probabilities are obtained, the characters are classified
according to some decision rules such as maximum Likelihood or Bayes's method [33,28,85].
Non-parametric Recognition:
This method does not require a priori information about the
data. It is used to separate different pattern classes along hyper planes defined in a given
hyper space. The finest known method of non-parametric categorization is the Nearest
Neighbor (NN) and is extensively used in character recognition. An incoming pattern is

14
classified using the cluster, whose the center is the minimum distance from the pattern over
all the clusters [18,32].
1.2.5.2. Syntactic and structural techniques
Syntactic and Structural techniques are based on the structural primitives taking into account
the physical structure of characters. In general, it is assumed that the character primitives
extracted from writing are quantifiable and one can find the relations among them. For
example, syntactic pattern recognition can be used to find out what objects are present in an
image. Furthermore, structural methods are strong in finding a correspondence mapping
between two images of an object [18, 32-33].
The main difference between these structural techniques and statistical techniques is that these
features are topological primitives and not measures. Several techniques are available such as
graph-matching algorithm, grammatical Methods and string matching.
1.2.5.3. Stochastic techniques
The stochastic techniques use a model for recognition, taking into account the high variability of
the shape. In these techniques, the models are often discrete and many studies are based on
the theory of Markov fields and Bayesian estimation. Markov fields permit the modeling of the
global properties while using local constraints. The model describes these states using state
transition probabilities and observation probabilities. The most common methods in these
techniques are the methods using Hidden Markov Models (HMM) [18, 24, 85].
11.3. Overview of Support vector machines (SVM)
The support vector machine (SVM) is a popular classification technique including linear and
non-linear, which not only has a solid theoretical foundation, but also it is more accurate than
other classifiers in many areas of applications for data classification. SVMs are originally
designed to use the principle of the structural risk [25].
In addition, they come to meet two major disadvantageous in machine learning which are:
Parametrically controlling the capacity of the SVM measured by the Vapnik­Chervonenkis
(VC) dimension;
Avoid over-learning (overfitting).
Two other advantages are particularly offered by SVMs. First, with an appropriate kernel, the
SVM can work well even if data are not linearly separable in feature space. Second, especially
popular in texts and images classification problems where very high-dimensional spaces are the
norm. These two advantages allow us to select SVMs as better candidates for performing
handwritten digit recognition.

15
SVMs are a machine learning algorithm for performing classification and regression via a
hyperplane in a high feature space. SVM has its ability to select the representative dataset from
the training dataset, which is commonly called "Support vectors" and attempts to find the
optimal line (hyperplane) that maximizes the distance (margin) between the closest points from
two classes. Once learned, a decision function is constructed in order to classify data according
to the region separated by the hyperplane.
The error is minimized by maximizing the margin controlled by the VC dimension of the
classifier. For a better clarity, the theoretical aspects of SVMs are explained in the following
sections [25-31].
1.3.1. Basic principles
Support Vector Machine (SVM), introduced by Vapnik in 1995 [25], is a binary
classification technique for supervised learning methods which can be used for both
classification and regression. In simple terms, given a set of training samples, each labeled as
members of one of two classes, an SVM classification training algorithm tries to construct a
decision model be able to predict whether a new sample falls within one class or the other. If
the samples are represented as points in space, a linear SVM model can be interpreted as a
division of the space to the examples belonging to separate classes, which are divided by a
clear gap that is as wide as possible. New samples are then predicted to belong to a class
based on which side of the gap they fall on. It is based on the use of a function called decision
function that allows separating optimally the data.
The principle of SVM during optimization is to maximize the margin between classes in order to
increase the capability of separation. The SVM classifier will classify all the points on one side of
the decision boundary as belonging to one class and all those on the other side as belonging to
the other class.

16
Figure 1.6. Maximum margin hyperplane and margins for an SVM
belongs to the hyperplane with the equation:
+
= 0
. Where is the normal of
the hyperplane while
|
|/
14T
is the perpendicular distance between the hyperplane and the
origin (Figure 1.6). Figure 1.6 shows an example where input data are grouped into two classes
which should be identified by the SVM classifier (Support vectors are circled).
In the case when the training data are linearly separable, the classifier can select two
hyperplanes in a way that they separate the data and there are no points between them, and
then try to maximize their distance. For all data (support vectors)
of the class
{ 1, +1}
14T
based learning, these hyperplanes (
,
0
)
14T
can be described through the
following equations:
+
> 0
= 1
+
< 0
=
1
(1.1)
It is possible to rescale
14T
and
14T
such that:
+
1
= 1
+
1
=
1
(1.2)
These can be combined into one set of inequalities:
(
+
)
1
(1.3)
Support
Vector
+
= 0
Optimal hyperplane
Maximum margin
+
< 0
h2
h1
+
> 0
/

17
The points for which the equality in equation (1.2) holds lie on the supporting hyperplane h1:
+
= 1
14T
and the supporting hyperplane h2:
+
=
1
14T
.Note that h1 and h2
are parallel and that no training points fall between them.
For a hyperplane equation
( ) =
+
, the distance of a point
to the
hyperplane is
( )/
14T
.
SVM constructs a decision surface hyperplane that maximizes the separation margin by
following a principle approach rooted in from the statistical learning theory. More precisely, the
VC dimension is a measure of the capacity and complexity of a statistical classification theory,
because it can predict a probabilistic upper bound on the test error of a classification model.
Where
14T
denotes the number of training set (restriction: this formula is valid when
).
Now, choose some
14T
such that
0
1
14T
. Then, for losses taking these values, with
probability
1
14T
, the following bound holds (Vapnik [30]):
test error
training error +
(log( ) + 1)
log( )
(1.4)
Where
14T
is a non-negative integer called VC dimension and is a measure of the capacity of the
learning machine. More precisely, the support vector machine is an approximate
implementation of the method of structural risk minimization. This induction principle is based
on the fact that the error rate of a learning machine on test data is bounded by the sum of the
training-error rate and a term that depends on the VC dimension. In the case of separable
patterns, a SVM produces a zero value for the first term (right-hand side of equation (1.4)) and
minimizes the second term, thus minimizing the overall risk. In the case of non-separable
patterns, the cost function that must be minimized will now perform an extra term weighted by
the parameter C which penalizes any SVM generates too non-separable patterns.
For the SVM optimization problem, the maximum possible margin between the
hyperplane and the support vectors are obtained by minimizing
, under the constraints:
(
+
) > 1
for any
1, ... ,
14T
. According to the optimization theory, the
goal (minimize
) and constraints (
(
+
) > 1
) are strictly convex. This
problem can be resolved by introducing the Lagrange multipliers
14T
,
therefore the constrained
problem can be expressed as:
( ,
, ) =
[ . ( .
+
)
1]
(1.5)
That must be canceling the partial derivatives with respect to
14T
and
14T
. In this expression
called dual form, the constraints of good ranking become a trade-off between a large margin
and a small error penalty.

18
The Karush-Kuhn-Tucker (KKT) conditions play a central role in both the theory and practice of
constrained minimization problem, it proves that this is equivalent to the equations solutions of
annulling the derivatives of Lagrangian with respect to the variables ,
,
.
The annulment
of partial derivatives of these partial derivatives gives :
{
( .
)}
,
= 0
(1.6)
can be computed thanks to the terms:
=
(1. 7)
The Lagrange multipliers are deduced by setting the partial derivatives to zero, which
correspond to support vectors. All other points have null
. Finally, the equation of the
separating hyperplane is defined:
( ) = (
) +
=
. ( .
) +
(1.8)
Where
are the solutions of non-null
and
is found by placing the coordinates of a
support vector
of the class
in
+
> 0
if
= 1
or in
+
< 0
if
=
1
.
The Soft Margin method allows selecting a hyperplane that splits the examples as cleanly as
possible, while still maximizing the distance to the nearest cleanly split examples. In the case of
non-linearly separable patterns, the objective function is then increased by a function which
penalizes non-null
, and the minimization optimization problem. The good ranking constraint
initially defined by:
(
+
) > 1
(1.9)
The Soft Margin method introduces non-negative slack variables , which measure the degree
of misclassification of the data . If the penalty function is linear, the optimization problem
released through a parameter
to become a:
(
+
) > 1
(1.10)
In this case, the maximum possible margin between the hyperplane and the support vectors,
initially obtained by minimizing
. If the penalty function is linear, the minimization
optimization problem becomes:
+
(1.11)
where is a strictly positive soft margin parameter to be determined called also regularization
constant. This parameter allows striking a balance between the two competing criteria of

19
margin maximization and error minimization, whereas the slack variables
indicates the
distance of the incorrectly classified points from the optimal hyper plane. Figure 1.7 shows the
incomplete separable data and slack variables (wrong side of the classifier).
Figure 1.7. Hyperplane soft margin for non-linearly separable data (
is misclassified while
is properly classified although its
is non-null)
1.3.2. SVM Kernel
The problem of a nonlinear classifier is addressed by nonlinear transformation of the inputs into
high-dimensional feature space, in which the probability of linear separation is high. This
transformation of input data into high-dimensional feature space is achieved using kernel
functions.
The method of linear separation, previously presented is somewhat limited. The introduction of
kernel functions will allow overcoming this limitation. The main idea of the nonlinear extension
is to map the data from the original input space to a feature space and find the hyperplane with
a large margin in the feature space. For an enlarged feature space, consider transformations of
a mapping as follow:
:
(1.12)

20
Figure 1.8. The data not linearly separable in 2-D are mapped onto three dimensions where a
linear decision surface between the classes can be made.
which maps the training data from
to some higher Euclidean space
H
, which possibly has
infinite dimensions (m is the dimension of the feature space). In this high dimension space, the
data can be linearly separable, hence the linear SVM formulation above can be applied to these
data. In the SVM formulation, the training data only appear in the form of dot products
.
.
These can be replace by dot products in the Euclidean space
H
:
( ).
.
.
=
( ).
(1.13)
In the equation of the separating hyperplane (1.8), the dot product
( . )
may be replaced by
any kernel function
( . )
providing a dot product. The equation of the separating
hyperplane becomes:
( ) = (
) +
=
. ( .
) +
(1.14)
where
are the solutions of:
{
( .
)}
,
= 0
(1.15)
The kernel functions commonly used are:
Linear kernel:
( , ) = ( . )
Gaussian Radial Basis Function kernel (RBF):
( , ) = exp (
)
for
>
0.
Sometimes parametrized using
=
.
Polynomial kernel:
( , ) = ( .
+ )
Sigmoid kernel or Hyperbolic tangent:
( , ) =
( .
+ )

21
The effectiveness of SVM depends on the selection type of the kernel, the kernel's parameters
and the soft margin parameter . A common choice is a Gaussian kernel, which has a single
parameter . The best combination of and
is checked using cross validation [26].
1.3.3. Multi-class SVM
SVM is designed only for separating two classes. Its extension for multi-classes has conducted
to develop various implementations SVM [27-30]. The multiclass problem can be processed as a
combination of binary classifiers. When using SVM for handwritten digit recognition, we have a
problem of ten (10) classes. Hence, some approaches are used for extending the binary SVM to
the mutli-class, which are One-Against-All (OAA), One-Against-One (OAO) and Direct Acyclic
Graph SVM. We review in the following the main properties of the three implementations.
One against all
: The most intuitive approach for making the multiclass SVM involves building
many SVMs classifier than two classes [28, 30]. One-against-all approach constructs
binary SVM classifiers and requires
comparisons for the decision, each one has the ability
to separate of which separates one class from all the rest. The classification of new samples
is done by a winner-takes-all strategy, in which the classifier with the highest output
function assigns the class.
arg
( )
(1.14)
In this way, the number of unclassified data is reduced. However, when the maximum
output value is obtained for two SVMs, it cannot assign the concerned data to a class.
Hence, this method leaves regions of the undecided feature space where more than one
class are accepted or all classes are rejected. This is demonstrated in figure 1.9.
Figure 1.9. Regions not classified by OAA approach for a problem of three classes.
( ) = 0
( ) = 0
( ) = 0

22
One-against-one:
Another combination method is "One-Against-One", also known as
"pairwise coupling", which builds one SVM for each pair of classes, or
(
1)/2
for a
problem with
classes. The classification is done by a max-wins voting strategy, in which
every classifier assigns the sample to one of the two classes. Then, the vote for the assigned
class is increased by one vote, and finally the class with the most votes determines the
sample classification.
( ) = 0
is the decision function that separates the class of the
class
by:
( ) =
( )
. During the classification step, max-wins voting strategy is
calculated as:
arg
,...,
(
( ))
,
(1.15)
However, when two classes have the same score,
will not be classified. This concept is
showed in figure 1.10.
Figure 1.10. Regions not classified by OAO approach for a problem of three classes.
Directed Acyclic Graph SVM (DAGSVM):
To resolve unclassifiable regions for the pairwise
classification, Platt et al. [31] proposed One-against-one in a DAG graph (decision-tree-
based pairwise classification), which uses directed acyclic graph (DAG) to reduce the number
of SVM that need to use during the testing phase. When having
classes, One-against-one
method requires
(
1)/2
classifiers, while the DAG method requires only to
1
classifiers. Fig. 1.11 shows the decision tree for the three classes. In the figure, shows that
does not belong to class . In this case, any pair of classes can be the top-level
classification of the tree. Except for the leaf node when
( ) > 0
, does not belong to
class . For example, if
( ) > 0
, does not belong to Class 2. Thus, it can belong to the
B C
A
C
A
B

23
Class 1 or 3 and the next classification pair is Classes 1 and 3 supplied by the decision
function.
Figure 1.11. Directed acyclic graph SVM method.
11.4. Summary
The objective of this chapter is to overview the main modules composed a character recognition
system. Depending on the application, each module has its own importance. In our case, we
focus on the development of a digit recognition system having three main modules: feature
generation, segmentation of the digit string to isolated digits and the classification based on the
SVM.
In the next chapter, we will introduce our system with a brief overview for recognition of
isolated handwritten digits.
( )
( )
( )
Not 2 Not1
Not 3 Not1 Not 2 Not1
Class 1
Class 3
Class 2

24
CC
HAPTER
I
SOLATED HANDWRITTEN DIGIT RECOGNITION
This chapter investigates the combination of different statistical and structural features for
recognition of isolated handwritten digits, a classical pattern recognition problem. The objective
of this study is to improve the recognition rates by combining different representations of non-
normalized handwritten digits. These features include some global statistics, moments, profile
and projection based features and features computed from the contour and skeleton of the
digits. Some of these features are extracted from the complete image of digit while others are
extracted from different regions of the image by first applying a uniform grid sampling to the
image. Classification is carried out using one-against-all SVM. The experiments conducted on
the isolated handwritten digit NIST SD19 and the CVL Single Digit Database realized high
recognition rates, which are comparable to state-of-the-art methods on this subject.
2.1. Introduction
Handwriting recognition has been the premier research problem of the document analysis and
recognition community for over three decades now. The sub problems in handwriting
recognition mainly include line, word or character level segmentation, recognition of isolated
characters, words, or complete lines/paragraphs and recognition of numerical strings and
isolated digits. Among these different modalities of handwriting recognition, this chapter
focuses on recognition of isolated digits, a classical pattern recognition problem that offers a
wide range of applications. Unlike alphabet, the ten glyphs of the most commonly used Western
Arabic numerals are shared by many scripts and languages around the world making them
globally acceptable. The main challenges in handwritten digit recognition arise from variations
in size, shape, slant, and most importantly, the differences in the writing styles of individuals.
With the recent advancements in image analysis and pattern classification, sophisticated digit
recognition systems have been proposed which aim to enhance the overall recognition
performance by improving the feature generation or/and classification techniques used. Some
of the studies aim to improve the classification performance by using a combination of multiple
classifiers while others aim to combine multiple features and select the most pertinent and
optimum set of features for this problem.

25
In this chapter, we are interested in enhancing the feature generation step for isolated digit
recognition used for avoiding digit normalization. The idea is to find a combination of multiple
features which improves the overall recognition rates by minimizing the intra-class variability
and maximizing inter-class variability [21-23], the most desirable requirement of any pattern
recognition system.
22.2. Overview of isolated handw ritten digit recognition
Over the years, various handwritten isolated digit recognition systems reporting high
recognition rates have been proposed. Most of these systems have been evaluated on the
widely used MNIST database [32].
Among significant contributions to digit recognition, authors in leCun et al. 1995 [32] present a
comprehensive comparison of different classification algorithms on the recognition task. Heutte
et al. [33] proposed a combination of seven different features to feed a linear discrimination
based classifier.
The method of Cai and Liu (2001) [34] present an approach that integrates both statistical and
structural information for the recognition of unconstrained handwritten numerals.
Dong et al. [35] extracted a set of gradient features while Teow and Loe [36] computed linearly
separable features from the MNIST database and applied triowise linear support vector machine
with soft voting for classification. Belongie et al. [37] developed a novel similarity measure by
finding the correspondences between points in two shapes and estimating an aligning
transform. The proposed matching technique achieved high recognition rates when applied to
digit recognition.
In another notable contribution, Lauer et al. [38] proposed a trainable feature extractor based
on LeNet5 neural network architecture. Classification carried out using Support Vector Machine
realized promising recognition rates. A comprehensive survey on handwritten digit recognition
on CENPARMI, CEDAR, MNIST databases can be found in [39].
Recently, the handwritten digit recognition competition [40] held in conjunction with ICDAR
2013 also provided a platform for comparison of state-of-the-art digit recognition techniques
under the same experimental conditions. In this competition, we have participated with two (2)
methods for handwritten digit recognition (Tébessa I and Tébessa II). In the following, we can
be cited these techniques as follow:
Salzburg I method:
The approach is based on the Finite Impulse Response Multilayer
Perceptron (FIR MLP). First, the color images are transformed to 8 bit gray-scale images.
Then these gray-scale images are resized to 20×20 pixel images and their center of mass is
computed. Each scaled image is positioned by their center of mass in the center of a 28 × 28

26
pixel image. Each pixel value has been normalized into the range [-1, 1]. For the experiments,
a neural network framework has been adapted to the FIR MLP. This method uses one partially
connected FIR MLP with four layers.
Salzburg II method:
The description of this method is similar to the previous (Salzburg I), but
uses an ensemble of four FIR MLP partially and fully connected with four layers.
Orand method:
The approach is based on the combination of four descriptors which allow
exploiting three different characteristics of image digits. For the pre-processing, a thresholding
operation using Otsu's method is applied. For the feature generation, three different
characteristics of the digits are exploited: the stroke orientations based on the HOG approach,
the relation between background and foreground based on concavities, and the contour. In
particular, the image is divided into 2 × 2 regions. For each region a histogram of orientations
using 32 bins is computed. Then, the descriptor is produced by concatenating the region
histograms. The image is resized to 40 × 40 pixels. Then a thinning operation is applied. The
profile with respect to the left and right side is computed yielding an 80-size descriptor.
Finally, the digit descriptor is the concatenation of the four described descriptors yielding a
240-size descriptor. For classification, a multi-class SVM classifier using a RBF kernel is used,
the cost parameter is set to 6, and the gamma parameter is set to 1.4.
Jadavpur method
: The system is based on a Fuzzy-Entropy-based feature selection strategy
over a combination of Quad-tree-based longest-run and convex-hull-based feature sets with
SVM classifiers. The system is finally developed using the selected 190 features.
Paris Sud method:
The images are pre-processed as following: First, getting rid of
color, second, using down-sampling to 20×20 resolution and finally, placing the images on a
28×28 grid by centering their center of gravity. Then Hamming tree algorithm and the
AdaBoost.MH implementation are used as classifier. The chosen classifier has 47,642 trees of
4 nodes (5 leaves) each. In each boosting iteration 100 random Haar filters are tested, chosen
uniformly from the possible geometries.
François Rabelais method
: First, the image is cropped to the bounding box of the digit,
deleting the white border, and is surrounded with a 1-pixel margin. It is then magnified to
have a final size of 128×128. Finally a skew and slant normalization is applied. The black
pixels are considered as input data points, and reduce them using k-means algorithm. The
Delaunay Triangulation is built on the input data points. A multilevel static uniform zoning is
computed at K distinct orders. For each cell of a grid, two values are appended: the number
of input elements in the cell ,and the average of input elements in the neighborhood of the
cell. Both the centers of gravity of the triangles and the black pixels within the cell are input
elements. A SVM classifier (libSVM) with a RBF kernel is trained and then used for the
prediction.

27
Hannover method
: The numerals are normalized, binarized and slope corrected. The feature
vector is composed of three methods: first, number of black pixels in each row and column;
second, lengths of 12 probes in different directions from different positions; and third,
normalized central moments. The digits are classified with a nearest-k-neighbor classifier.
Tébessa I method
: This method is conducted with combination two structural features without
uniform grid method, background features and foreground features of the skeleton. For the
three remaining methods: global features, ridgelet transform and foreground features, the
image was divided into four regions by using uniform grid method. The recognition module is
based on the SVM multi-class approach using the one-against-all implementation. SVM and
Tébessa II method
: This method is based on multi-scale run length features which are
determined on the binary image taking into consideration both the black pixels corresponding
to the ink trace and the white pixels corresponding to the background. The probability
distribution of black and white run-lengths has been used. The proposed method includes
global features, the ridgelet transform, background features, foreground features and multi-
scale run length features, completing the system by a multi-class SVM classifier based on
approach one-against-all.
The objective of our study is to find a combination of features which achieves high recognition
rates on non-normalized isolated handwritten digits. We have considered global and local,
structural and statistical features [22-23,41] in our work. The features that we consider in our
study include Hu's moment invariants, skew angle, Zernike moments, profile and project based
features, background and foreground features and Ridgelet transform. The proposed approach
aims to combine these different features to best represent the digits.
22.3. Size normalization
To evaluate our technique on the same database but normalized. We used the size
normalization module which allowed adjusting the size, position and shape of the digit image in
order to reduce the shape variation between images. In this way, the feature generation task is
facilitated and to expect improving their classification [42].
In our case, the method based on bilinear interpolation was selected which assign each target
point a linear combination of four nearest source points and then performing the inverse
transformation. For example, suppose that the gray levels
u(i1, i2)
of an image are known
points with integer coordinates (0, 0), (1, 0), (0, 1), (1, 1), assigning a gray scale value at the
point (x, y) is effected by a bilinear interpolation [43], is defined by:
( , ) = (1
). (1
). (0,0) + (1
). . (0,1)
+ . (1
). (1,0) + . . (1,1) (2.1)

28
Other types of interpolation can be defined by using more points with integer coordinates. By
these methods, it is possible to increase the resolution of images fictively. In our work, the
image size is reduced to a normalized size 24 x 24.
22.4. Feature generation
Feature generation aims to express input data using a numerical representation or a set of
symbols (coding) to select the best set of features (feature vector) for a particular problem,
digit recognition in our case [22-23, 33, 39, 43]. Features are generally categorized into global
or local and statistical or structural features. Statistical features represent pattern classes by
statistical measures while structural features use formal structures for data representation.
Commonly used statistical features include moments, descriptors and geometrical
measurements etc. Examples of structural features include bends, end points, intersections,
loops and measures of concavity etc. [44]. Structural properties can sometimes also be
expressed using statistical measures. Each of these types of features can be extracted globally
from the objects (digits, characters, words or paragraphs) under study or locally from small
regions of these objects. However, each feature is more suited to one type or the other giving
rise to global and local features.
In the following sections, we provide an overview of the features used in our study.
2.4.1. Global features
Global features are computed from the image of the digit as a whole. The global features we
compute include the following.
2.4.1.1. Density
Computing the density would be to count the number of black pixels and dividing by the total
number of pixels.
2.4.1.2. Center of gravity
The centre of gravity, or sometimes called as centre of mass, is a point in an image in which its
mass is concentrated. They characterize images in a way that has analogies to statistics.
Generally, an image may be considered as a Cartesian density distribution function I(x, y),
where I(x, y) is the intensity of the pixel. Then, the two-dimensional geometric moment of
order (p+q) of a function I(x, y) is defined as:
=
( , ) dxdy (2.2)
where
is the basis or the weighing kernel with , = 0,1,2, . . , . The basis may have
properties that are passed on to the moments, producing descriptors that are invariant to scale,
translation or rotation.

29
In order to express the above equation in a discrete form, the image has been sampled into
square unit pixels.
=
( , ) (2.3)
The zeroth order moment defines the total mass of the image (segmented image, its area, total
number of pixels).
=
( , ) (2.4)
The tow first orders geometrical moments define the centre of gravity of the image,
respectively the centre of coordinates of an image:
=
,
=
(2.5)
These two coordinates represent the center of gravity of the digit image. Then, the central
moments in discrete form can be expressed as :
=
(
) (
) ( , ) (2.6)
2.4.1.3. Second order geometrical moments
The second order geometrical moments are a statistical measure of the allocation of pixels
around the center of gravity. Second-order moments
m
20
and
m
02
describe the distribution of
mass of the image with respect to the coordinate axes. In mechanics, they are called the
moments of inertia. It may be used to determine the principal axes, image ellipse and radii of
gyration.
2.4.1.4. Number of transitions:
The number of white to black transitions (or vice versa) counted at each pixel in the four
principal directions.
2.4.2. Hu's Moment Invariants
Hu proposed the application of moment invariants to image analysis and object representation
problems in [45, 46]. Since then, they have been effectively applied to a large number of shape
matching and similar problems. Hu's seven moments are invariant with respect to position,
scale and orientation. These moments capture information on image area, centroid and its
orientation.
Hu's moments invariant are calculated using combinations of second and third order normalized
central moments. The normalized central moments are given by
=
(2.7)
where
=
+ 1 ( + )
2
(2.8)

30
Hu's seven invariant moments are then calculated by
=
+
, (2.9)
= (
) + 4
,
(2.10)
= (
3
) + (3
) ,
(2.11)
= (
+
) + (
+
) ,
(2.12)
= (
3
)(
+
)[(
+
)
3(
+
) ] + (3
)(
+
)[3(
+
)
(
+
) ], (2.13)
= (
)[(
+
)
(
+
) ] + 4
(
+
)(
+
), (2.14)
= (3
)(
+
)[(
+
)
3(
+
) ]
(
3
)(
+
)[3(
+
)
(
+
) ]. (2.15)
The seventh moment invariant, , is also skew invariant. These seven invariant moments were
calculated for each binary digit image.
2.4.3. Skew
The skew or orientation of the digit is calculated using Radon transform of the image [47, 48].
Radon transform of the image is the sum of radon transform of each pixel in the image. The
radon function takes parallel beam projections of the image from different angles and the skew
angle is determined based on the maximum value of radon function which is used as a feature.
2.4.4. Zernike moments
Zernike moments [49] have been widely employed in a wide variety of pattern recognition
problems and we use them in our study for characterizing the digits. For efficient computation
of Zernike moments, we implemented the method in [49] which is based on recurrence
relations for fast computation of radial polynomials of Zernike moments.
Zernike complex moments are constructed using a set of Zernike polynomials. Zernike
polynomials are a sequence of orthogonal polynomials on the unit disk and they can be
expressed as:
,
( , ) =
,
( , ) =
,
( )
(2.16)
where i =
1 ,
n
is the order of the radial polynomial and
m
is positive or negative integer
according to the conditions
| | even, |m|
n, representing the repetition of the azimuthal
angle. From
x
and
y
we obtain
and
by simple conversion to polar coordinates ( =
,
=
). The radial polynomial R
,
( )
is defined as:
,
( ) =
( 1)
(
)!
!
|
|
!
|
|
!
(
| |)/
(2.17)

31
Zernike moments for a digit image can be obtained by making use of the complex conjugate
property, which is Z
,
= Z
,
. We will skip directly to the formulation of the moments in
adequate form:
,
=
( , )
,
( , ),
+
1 (2.18)
where
I(x, y)
is the image function and denotes the complex conjugate with order
n
and
repetition
m
. The procedure for computing Zernike moments can be seen as an inner product
between the image function and the Zernike basis function.
If an image function ( , ) having Zernike moments
,
is rotated counterclockwise by
angle ; the transformed image function is ( , ) = ( ,
)
, by simply substituting that into
the equation. The discrete form of the Zernike moments of an image is expressed as follows:
,
=
,
and
,
=
,
(2.19)
The amplitudes of Zernike moments
,
are rotation invariant. Invariance to the scale and
translation can be obtained by shifting and scaling the image before the computation of Zernike
moments. In our implementation, we compute up to fourth order Zernike moments.
2.4.5. Projections
Horizontal and vertical projections are determined by counting the total number of text pixels in
each row/column of the image. These values are normalized by the width/height of the image.
The mean and variance of these projections are used as features in our study.
2.4.6. Profile Features
Left and right profiles are computed by considering for each image row, the distance between
the first text pixel and the left (right) boundary of the digit image. Like projections, the profiles
are normalized to the interval [0 1] and the mean and variance of these profiles are employed
as features.
2.4.7. Background features
The background feature vector is based on the concavity information. These features are aimed
at capturing the topological and geometrical properties of the digits. Each concavity feature is
the number of white pixels belonging to a specific concavity configuration [41]. The label for
each white pixel is chosen based on Freeman code with four directions. Each direction is
explored until a black pixel or the extremity of the digit is met.

32
Figure 2.1. Different configurations of concavity
In addition to the nine standard concavity configurations, we also consider five additional
configurations to more accurately model the loops in digits. These configurations are illustrated
in Figure 1 and produce a 14 dimensional feature vector which is normalized between 0 and 1.
Figure 2.2 shows concavity labels of the background pixel for a sample of digit `9'.
Figure 2.2. Concavity labels for digit `9'

33
2.4.8. Foreground features
The foreground features are computed from two different representations of digit, contour and
skeleton. Each of these types of features is discussed in the following.
2.4.8.1. Contour Based Features
These features are aimed at capturing the dominant orientations in the shape of the digit and
are computed from the contour of the (digit) image. The contour is detected using
morphological operations [18, 44, 41, 50] and is represented by Freeman chain codes
traversing the pixels in clockwise direction. This generates a string of codes in the interval [1 8]
for the contour of a digit. The normalized histogram of these codes is then computed and is
used as feature to characterize the digit (Figure 3).
This histogram of contour chain codes is effective in capturing the dominant stroke directions
(horizontal, vertical or diagonal). However, these features are very sensitive to noise and also
fail to capture the structure and topology of the digit.
Figure 2.3. Contour detection: (A) Contour of the upper region, (B) Feature vector, and (C) 8-
Freeman directions
2.4.8.2. Skeleton Based Features
These features are computed from the skeleton of the image of the digit. The skeleton of the
image is computed [2, 8] and is searched for end points, crossings (interconnections) and
(horizontal and vertical) directional points.

34
The neighbor N(p) of a current pixel
p
on the skeleton is calculated by using the 8-Freeman
directions in the clockwise such as:
N(p) =
I(np ) (2.20)
where
is the i
th
neighbor of
p
. Then, the end point is defined when ( ) = 1 , the
directional points is defined when ( ) = 2 and the Interconnection Point (IP) is defined when
( ) > 2 .
Figure 4 illustrates some examples of each type of points (labeled as 1, 2, 3 and 4 respectively)
in a digit. The normalized histogram of the occurrence of these points in a digit is used as a
feature.
Figure 2.4. Features extracted from the skeleton
These skeleton based features compute the structural information of the digit but like contour
based features; these features are also sensitive to noise in the image.
2.4.9. Ridgelet Transform
The Ridgelet transform defined by Candès and Donoho [51] has been effectively employed for
pattern recognition [52]. The Ridgelet transform combines the Radon and wavelet transforms.
Radon transform has the ability to detect lines in the image while the wavelet transform allows
detecting line singularities along the Radon slices.
Ridgelet transform has been successfully applied to a number of problems including image
compression, image transform coding, character recognition, watermarking, texture based
image retrieval and biometric identification [52].
The Ridgelet transform is based on the Radon transform which is computed on several angular
directions. Radon coefficients correspond to projections representing the shadow of the shape

35
at each angle [51]. Consequently, significant linear features in any direction are expressed by
high magnitudes. Thus, in order to characterize linear singularities, the one-dimensional wavelet
transform is applied on Radon slices to yield the Ridgelet coefficients. Hence, along the Radon
axis projection, the Ridgelet is constant while in the direction orthogonal to these ridges it is a
wavelet [53].
For an image ( , ), the Ridgelet transform can be computed by first calculating the Radon
transform as defined in [52].
( , ) =
(
,
) (
cos( ) +
sin( )
)
(2.21)
Where , and are Dirac distribution, angular and radial variables, respectively.
The 1-D wavelet transform is then applied on each Radon slice in order to obtain the Ridgelet
coefficients
( , ). The sum of the normalized values of the coefficients is used as feature.
2.4.10. Region sampling: uniform grid
Uniform grid sampling [54] is applied to the image of the digit which allows extracting features
from different regions of the image separately. A uniform grid creates rectangular regions for
sampling where each region is of the same size and has the same shape. For a given image of
size H x L, the position of horizontal and vertical grid lines for sampling is determined as
follows.
= ×
i = 1,2, ... , n
1
(2.22)
Where p is the vector of line positions, n is the number of horizontal or vertical regions, and k is
the width or the height of the image. Figure 5 illustrates an example of a digit split into a 2x2
grid. Once the image is divided into different regions, features are extracted from each region
separately. This allows a different level of granularity and features extracted from similar
regions of the digit can be compared with one another allowing more effective matching.
Figure 2.5. Example of splitting a digit using a uniform grid (2x2).

36
A summary of the features used in our study along with the dimensionality of each is
summarized in Table 2.1.
Feature
Description
Dimension
f1
Global features
8
f2
Hu's Moment Invariants
7
f3
Skew Angle
1
f4
Zernike Moments
50
f5
Projection Histograms
4
f6
Profile based features
4
f7
Background features
14
f8
Histogram of contour chain code
8
f9
Skeleton based features
4
f10
Ridgelet transform
1
Table 2.1. Summary of features
22.5. Recognition
The proposed recognition engine is based on SVM multi-class approach using the one-against-
all implementation [28, 29]. The features discussed in the previous section are extracted from
the training data set (to be discussed in the next section) and are fed to the SVM. Two
important parameters required for training the SVM include the regularization parameter (C)
Thus, some properties should be considered between C and .
A low value of C (<1) error rate increases with higher value of .
Large values of C counter-balance the bias introduced by large .
Very small value of may result in good training data error rate but won't be useful in the
case of test data recognition error rate.
With large value of the Gaussian kernel of RBF becomes almost linear.
We consider an optimal pair of
j
and C
i
by testing and comparing the accuracy rates of the
corresponding experimental SVMs using the validation database.
The next section presents the experimental setup and the corresponding results.
2.6. Experimental results
In order to validate the concept of feature generation as well as to show the robustness of our
system, we ran two different experimental results on two databases. The first experiments

37
conducted on the isolated handwritten digit NIST SD19 and the second experiments conducted
on CVL Single Digit Database.
2.6.1. Experimental results on NIST SD19
This work [55] is devoted to generating pertinent features in order to find the best ones of
normalized and not normalized isolated handwritten digits. The algorithm that we propose
includes global features, the Ridgelet transform, background and foreground features,
completing the system by a multi-class SVM classifier based on the One-Against-All
implementation.
The experiments are conducted on NIST SD19 database [56] containing not-normalized isolated
handwritten digits stored as binary images. The isolated handwritten digits have the advantage
of being abundantly available in the form of a labeled database. This database is divided into 3
divisions (hsf_0123, hsf_4 and hsf_7).
Therefore, we evaluate the efficiency of various combination features on division hsf_0123 from
which we extract 120,000 handwritten digits. We divide this database into three different parts:
a training database containing 60,000 digit images and a validation database containing 30,000
digit images, while the remaining 30,000 digit images are kept for testing.
Various feature methods are generated from the normalized and not normalized isolated digit
image in order to evaluate its influence. We test five feature sets independently to SVM
classifier, which are Global features (f1), Ridgelet transform (f10), background features (f7)
and foreground features (Histogram of contour chain code (f8) and Skeleton based features
(f9)) extracted from the images. The experiments are carried out with and/or without the
uniform grid.
Finally, the recognition module is based on the SVM multi-class approach using the One-
Against-All implementation. SVM and RBF kernel parameters are fixed to C=10 and
= 8,
respectively.
In the following experiments, we provide various combination of the features used in our work.
Experiment 1
: This experiment is performed with the combination of structural features
without uniform grid, in order to assess the quality of recognition. This combination appears
as follows: Global features (f1), Ridgelet transform (f10), background features (f7) , histogram
of contour chain code (f8) and skeleton based features (f9).Generally, the final features vector
is composed of 35 components. This evaluation is illustrated in the Table 2.2.
Experiment 2:
This experiment is performed with the uniform grid which is divided into four
regions each of which is the same size. Features are generated from each region as described
in Experiment 1. Finally, the global feature vector is composed of 140 (35x4) components.
This evaluation is illustrated in Table 2.2.

38
Experiment 3:
This experiment is conducted by combining two structural features background
features (f7) and Skeleton based features (f9)). For the three remaining methods: Global
features (f1), Ridgelet transform (f10) and Histogram of contour chain code (f8), the image is
divided into four regions by using the uniform grid method. Then, we generate for each
region 17 components. Finally, the global feature vector is composed of 86 (l7x4+ 14+4)
components. This evaluation is reported in Table 2.2.
Class
Experiment 1
Experiment 2
Experiment 3
N
Normalized
D
DB
N
Not normalized
D
DB
N
Normalized
D
DB
N
Not normalized
D
DB
N
Normalized
D
DB
N
Not normalized
D
DB
0
98.30
99.30
95.40
98.90
98.30
99.60
1
97.10
98.90
97.50
98.20
99.20
99.10
2
97.40
97.50
96.20
98.10
99.00
99.40
3
98.70
99.30
98.50
98.80
98.80
99.10
4
98.20
98.00
98.20
98.50
98.90
99.30
5
98.70
98.90
97.50
98.20
99.50
99.30
6
98.20
98.10
95.10
96.10
99.20
98.90
7
97.20
96.80
96.60
97.00
97.80
97.80
8
97.90
98.20
96.00
96.80
99.20
99.20
9
98.40
98.60
98.60
98.40
99.00
99.60
Overall rate
98.01
98.36
96.96
97.90
98.89
99.07
Table 2.2. The recognition rate for various experiments
We can note that the overall rate is acceptable for Experiment 3. However, some digits are
confused and often they are visually very similar.
This evaluation is significant because the results given in Table 2.2 show that the combination
of features results in an important improvement overall rate. However, we can see that the
foreground features of the skeleton and the background features without uniform grid method
allow better discrimination of digits compared to Experiments 1 and 2, respectively.
Experiment 3 shows that the use of background features without uniform grid method
allows locating effectively concave regions , for example, loop, the same way as for Skeleton
based features. It is used in order to reduce confusion between the classes "4" and "7" for
example. However, we can note that the methods of structural feature generation used work
well with the database of the not normalized digits.
The digit image is divided into separate regions via splitting sampling allowing the generation of
features having pertinent information. The use of a uniform grid allows this information to be
improved upon for each feature by defining different levels of granularity . Features generated

39
in various experiments show that the most effective features are structural, strokes and
concavities when used in combination with a 2x2 region.
2.6.2. Experimental results on CVL Single Digit Database
The proposed methodology [57] was evaluated on the CVL Single Digit database [40]. The
database comprises 7,000 digits for training, a validation set of equal size, and an evaluation
set consisting of 21,780 digits. These samples are contributed by different writers with varying
writing styles. Prior to feature generation, we binarize all images using the KittlerMet
binarization method [58]. A summary of the features used in these experiments, seven of these
features (f2, f3, f4, f5, f6, f7, f9) are extracted directly from the image of the digit while three
features (f1, f8 and f10) are extracted from different regions of the image by applying the
uniform grid method. The features are directly extracted from the binary images of digits
without any size normalization. Then, the proposed recognition module is based on SVM multi-
class approach using the one-against-all implementation. SVM and RBF kernel parameters are
Feature
Precision (%)
Recall (%)
f1
93.81
93.80
f2
17.13
20.02
f3
14.78
15.80
f4
78.62
77.22
f5
39.36
39.26
f6
53.15
49.07
f7
85.05
84.78
f8
20.40
20.52
f9
18.51
19.62
f10
25.61
23.27
Table 2.3. Recognition results on individual features.
The performance of the system is quantified by computing the precision and recall. We first
present the results of individual features summarized in Table 2.3. It can be seen that while the
performance of these features varies significantly, the global features (f1) extracted from the
four regions of the digit image outperform all other features reporting precision and recall of
approximately 94%. Zernike moments (f4) and background features (f7) also realize acceptable
recognition rates while the performance of rest of the features is not very impressive when
evaluated individually.
Table 2.4 summarizes the performance of some of the feature combinations that we have
tested. Almost all the combinations report high recognition rates whereas the highest
recognition rate (recall and precision of 96.62%) is achieved when combining all the features
(f1-f10).

40
Features combinations
Precision (%)
Recall (%)
f1, f2, f3, f4, f5
94.86
94.88
f6, f7, f8, f9, f10
95.69
95.70
f1,f4,f7
96.16
96.18
f2, f3, f5, f6, f8, f9, f10
82.58
82.64
f1,f4,f7,f8,f9
96.18
96.23
f1, f2, f3, f4, f7, f8, f9
96.19
96.25
f1, f2, f3, f4,f7, f8, f9,f10
96.30
96.31
f1, f2,f3,f4,f5,f7,f8,f9,f10
96.61
96.62
f1-f10
96.62
96.62
Table 2.4. Recognition results on feature combinations.
We also carry out a series of experiments to compute the precision and recall for each of the
digits (0-9) separately. The recognition results for each digit are summarized in Table 2.5. It
can be observed from the results in Table 2.5 that in general, the precision and recall values are
more or less consistent across different classes. Relatively low recall is achieved on some digits
(7, 8 and 9). Furthermore, it should be noted that some pairs like ('0','8'), ('2','7') and ('3','8')
offer relatively less inter class variations making their recognition more challenging.
Class
Precision (%)
Recall (%)
0
97.77
98.67
1
96.16
98.76
2
97.49
98.16
3
95.85
95.36
4
95.59
97.61
5
96.74
96.74
6
97.80
98.12
7
96.88
95.50
8
96.79
95.41
9
95.10
91.87
Average
96.62
96.62
Table 2.5. Recognition rates on individual digits
We also compare the performance of the proposed combination of features with state-of-the-
art digit recognition systems submitted to the Digit Recognition Competition held in conjunction
with ICDAR 2013. The database (CVL) and evaluation protocol considered in our study is the
same as that of the aforementioned competition. This comparison is summarized in Table 2.6
where the precision of other systems has been reproduced from [40].

41
Rank
Method
Precision (%)
Normalized Digits
1
Salzburg II
97.74
Yes
2
Salzburg I
96.72
Yes
3
Our method
96.62
No
4
Orand
95.44
Yes
5
Jadavpur
94.75
No
6
Paris Sud
94.24
Yes
7
François Rabelais
91.66
Yes
8
Hannover
89.58
Yes
9
Tébessa II
78.43
Yes
10
Tébessa I
77.53
No
Table 2.6. Comparison of proposed method with state-of-the-art methods [40]
It can be seen from Table 2.6 that the proposed method realizes better performances than
most of the systems submitted to the competition. Two systems (Salzburg II and Salzburg I)
report slightly better recognition rates with precisions of 97.74% and 96.72% respectively. It
should however be noted that our method does not require any size normalization and the
features are directly extracted from binarized images of isolated digits. The combination of
multiple features in our case reduces the confusion between digit classes and consequently
results in high values of precision and recall.
22.7. Summary
The objective of this chapter was to find a representation of isolated handwritten digits that
allows their effective recognition. We proposed a combination of different statistical and
structural features, some are computed from the complete image of digit while others are
computed by first applying uniform grid sampling to the image. This combination of features
was investigated using SVM as classifier. The experiments conducted on a standard database of
isolated digits without any normalization of images realized high recognition rates comparable
with the state-of-the-art methods proposed on this problem.
The initial results obtained by the proposed combination of features are very encouraging. In
the next chapter, the concepts about segmentation of two connected handwritten digit
recognition are discussed.

42
CC
HAPTER
S
EGMENTATION OF TWO CONNECTED HANDWRITTEN DIGIT
RECOGNITION
Connected digits are the frequent observed situations occurred on the digit string. Hence, we
propose in this chapter a segmentation-verification system using conjointly the oriented sliding
window and support vector machine (SVM) classifiers. The oriented sliding window is used for
finding at the same time the interconnection points and the optimal angle for cutting the
adjacent digits. Whilst, SVM-based segmentation-verification using the global decision module
allows the rejection or acceptance of the processed image. Experimental results conducted on a
large synthetic database of handwritten digits show the effective use of the oriented sliding
window for segmentation-verification.
3.1. Introduction
Automatic reading of digit fields from an image document has been proposed in several
applications such as bank checks [1], postal code [2] and forms [3]. The usual approach for
designing a handwritten digit string recognition system is based on the segmentation-
recognition which can be performed mainly in four steps [4]: preprocessing, segmentation,
feature generation and classification. The preprocessing step generally allows transforming the
grayscale image of the digit string into the binary image for facilitating further processing. The
segmentation step is used for separating the overlapped and/or connected adjacent digits into
elementary digits in order to deduce the possible distinct classes [5-9]. After that, a feature
generation is performed on the digit image for reducing the dimension of the representation
and thus makes the design of the classification system. Next, a decision function allows
assigning a digit image to predefined class.
In order to improve the reliability of the system, an additional step is performed for verifying
the correct segmentation using the global decision stage for rejecting or accepting the
processed image. The segmentation-recognition with verification is usually namely
segmentation-verification [59]. Generally, algorithms based on the segmentation-recognition
approach are faster than segmentation-verification since a number of separation attempts are
smaller.

43
The design of a robust handwritten digit recognition depends of many factors such as the
techniques used for processing the image, algorithms for separating the adjacent digits,
methods for generating features and the type of classifier used for recognizing the segmented
digits. In this chapter, we only are interested in the segmentation stage since it is considered as
the most difficult task for designing robust handwritten string recognition. Hence, three cases
can occur when attempting to segment two adjacent digits: distinct digits, overlapped digits or
connected digits. In most cases, connected digits are the frequent observed situations [60].
Furthermore, the connection between adjacent digits can be simple (only one connection of
adjacent digits) or multiple (many connections of adjacent digits). The simple connection is the
frequent situation encountered comparatively to the multiple connections. Various developed
algorithms are more robust for separating simple connection than multiple connections [9].
33.2. Overview of tw o connected handw ritten digit
recognition
Various explicit segmentation algorithms of two handwritten connected digit have been
proposed based on the morphology of the connected digits for finding one or multiple
interconnection points [9], which are based on the contour [61-63], the skeleton [42,64], the
reservoir [65] or by combining simultaneously the contour, profile and the skeleton [8,66].
More recently, these algorithms have been compared objectively using synthetic and real
connected digits [9]. In the following, we briefly review the most popular algorithms to segment
two connected digits according the type of the morphology for detecting the cutting path.
Many algorithms based on the contour and profile have been proposed for segmenting two
connected digits (see Table 3.1). The first one has been proposed by Fenrich and
Krishnamoorthy (1990) [67], which use the vertical histogram projection and contour
information of two touching digits in order to derive valley and peak points. The cutting path is
then decided to segment the touching pattern by joining valley and peak points. However, the
main difficulty of using this method is the detection of valley and peak points when adjacent
digits are strongly connected. Fujisawa et al. (1992) [62] proposed to analyze the distance
between upper and lower profiles in order to detect the segmentation points, which are used to
build a segmentation graph for defining the best segmentation hypothesis. However, when the
handwritten digits strongly skewed or overlapped, the projection profile cannot reach the
touching points and therefore the segmentation points are not detected. Congedo et al (1995)
[61] proposed an algorithm based on the concept of the drop fall which involves a set of rules
dictating that the next pixel is in the direction of the fall of water drop on the contour. This
algorithm has proved to be an efficient segmentation method due to its simplicity and
effectiveness. However, the main difficulty of using this algorithm is finding the best start point
which can be in the wrong place. Therefore, completely ineffective segmenting paths will be

44
obtained. Shi and Govindaraju (1997) [63] proposed to segment two connected digits followed
by using the chaincode for detecting the most significant right turns at each point of touching
according the opposite contour point. Both right turn and their opposite contour points are
located to split the contour into many pieces, which are then classified as belonging to one or
the other digit. The final segmentation point is then determined by a vertical line bisecting the
image using a set of heuristic rules like height, width and position of adjacent connected
components, plus the distance between them. Kyung et al. (2002) [68] proposed using
structural features of contours to generate multiple hypotheses. These hypotheses based on an
analysis of the ligature and the characteristics of candidate break points and verified by MLP
numeral classifier. Moreover, Pal et al. (2003) [65] proposed an improvement of the Drop falls
algorithm based on the concept of water reservoir to find the best starting point for the drop in
order to avoid the local minimum. This method is based on research of the large area
"reservoir" where the digits are touching. The reservoirs are obtained by considering areas
resulting from a water drop on the component from the top or bottom of this one. This method
is effective compared to the drop fall algorithm. However, it cannot reach correct segmentation
due to wrong reservoir. Later, Lei et al. (2004) [69] proposed a recognition-based segmentation
of touching handwritten numeral strings with help of external and internal contour analysis and
projection analysis; these features are utilized to define the segmentation lines. Furthermore,
Ma et al. (2008) [70] described a touching pattern-oriented strategy for segmenting touching
digits in order to solve the problem of determining the best separation according set of Drop-fall
algorithms[61] can produce four possible cutting paths(Descend-left, Descend-right, Ascend-left
and Ascend-right). However, Wang et al. (2009) [71] proposed a recognition-based approach to
handwritten numeral string segmentation. This method generated all possible candidate
segmentations of an input string through contour and profile analysis, and then the feature
vectors containing recognition information of the statistical modeling concepts of Gaussian
Mixture Model (GMM).
The segmentation method based on the skeleton initially has been proposed by Chen and Wang
(2000) [42] by using both background and foreground analysis for simple or multiple touching
digits. A thinning algorithm is used in order to obtain the background and foreground skeletons
of the given image. Fork points, end points, and bend points from the skeleton are used as
features. This algorithm allows removing a useless stroke, which could disturb recognition.
Thereafter, all segmentation hypotheses are combined into a segmentation graph, and a
genetic algorithm is used to search among all the possible outputs. The segmentation-
recognition approach is used for recognizing the segmented digits. More recently, Everton and
Carlos (2013) [72] proposed the clustering the feature points selected from the skeletal
touching regions via Self-Organizing Maps (SOM) for segmenting single or multiple connected
handwritten digits. This method uses two parallel processes to define the segmentation points.
Firstly, the image is thinned and the branch points of the skeleton are detected and selected.

45
Secondly, features are extracted by scanning the image left to right, top to bottom, and saving
simultaneously the coordinates of the foreground pixels. These features are clustered using
Self-Organizing Maps (SOM) into three regions: left, right and middle region. Then, the
segmentation points are defined as the feature points which are close to nodes of the SOM
network. Elnagar and Alhajj (2003) [64] proposed an alternative approach to split pairs of
connected digits based on a segmentation-recognition approach using two agents. The pattern
is first preprocessed, normalized and thinned to obtain uniform stroke width, and then the
feature points are extracted from the skeleton and contour. Segmentation points are identified
near the decision line estimated from two agents, namely, top-valley based and bottom-hill
based and two candidate cut-points from each of them. After separation, the pattern is then
restored using heuristics. Besides, the Suwa method (2005) [3] proposed a thinning-based
segmentation strategy for a couple of handwritten connected digits graph representation. This
method basically improved previous method is reported by Elnagar and Alhajj method.
Almost at the same time, Oliveira et al [59, 66] proposed to combine structural features as the
contour, the profile and the skeleton points for generating segmentation cuts. The relationships
of feature points determine the list of segmentation hypotheses of the connected digits. All
segmentation cuts are produced using line segments connecting the segmentation points.
Neural network is used to evaluate the correct segmentation path according the segmentation-
verification approach for accepting or rejecting a handwritten segmented digit [59]. The main
difficulty of using this method is that it generates an over-segmentation or under-segmentation
which can be easily confused with an isolated digit. Later, Oliveira et al. method (2002) [59]
improved their algorithm by adding a new verification scheme containing two verifiers. A new
feature set is also introduced to feed the over-segmentation verifier. All segmentation points
are used to build a segmentation graph. The best segmentation hypothesis is the shortest path
of the graph. A postprocessor is used to resolve some problems when verifiers failed and vice
versa, and the global decision module makes an accept/reject decision.In order to reduce the
number of segmentation cuts, Vellasques et al. (2008) [8] proposed the use a filter for
discriminating isolated digits from over-segmented pieces. In this way, unnecessary cuts can be
avoided in a segmentation graph of Oliveira et al. method [59] without any attempt of
classification by a general-purpose classifier. However, the difficult task for finding the best
cutting is that adjacent digits can be oriented which leads to an over-segmentation or under-
segmentation [66].
Recently, Gattal and Chibani [73-74] proposed a system recognition-based segmentation to
recognize handwritten connected digits, the proposed approach allows separating connected
digits according the connection configuration by finding the interconnection points (IPs), after
that, in the work [73], we fixed sliding window around IPs. Otherwise, in the work [74], we set
a crossing oriented window around IPs with sloping angle for finding correctly cutting path.

46
Moreover, the segmentation and recognition module use the global decision for choosing the
correct segmentation path.
Features
Methods
Approach
Data
Size
Perf. (%)
Contour
and
profile
Fenrich and Krishnamoorthy (1990) [67]
Rec-based
ZIP codes
450
94.90
Fujisawa et al. (1992) [62]
Rec-based
Proprietary
920
95.00
Congedo et al (1995) [61]
Seg-Rec
CEDAR
91.00
Shi and Govindaraju (1997) [63]
Seg-Rec
CEDAR
1,966
80.80
Kyung et al. (2002) [68]
Rec-based
NIST SD19
3,500
92.50
Pal et al. (2003) [65]
Seg-Rec
French bank
checks
2,250
94.80
Lei et al. (2004) [69]
Rec-based
NIST SD19
3,359
97.72
Ma et al. (2008) [70]
Rec-based
NJUST603HW
1,591
88.18
Wang et al. (2009) [71]
Rec-based
NIST SD19
2,000
97.95
Skeleton
Chen and Wang (2000) [42]
Seg-Rec
NIST,
proprietary
4,500
96.00
Everton and Carlos (2013) [72]
Seg-Rec
Ribas et al. [9]
323
65.79
Elnagar and Alhajj (2003) [64]
Seg-Rec
NIST, CEDAR,
proprietary
96.00
Suwa method (2005) [3]
Seg-Rec
NIST SD19
2,000
88.40
Contour,
profile
and
skeleton
Oliveira et al (2000) [66]
Rec-based
Brazilian bank
checks
900
95.24
Oliveira et al. method (2002) [59]
Rec-based
Brazilian bank
checks
900
98.50
Gattal and Chibani (2011) [73]
Rec-based
NIST SD19
250
86.26
Gattal and Chibani (2012)[74]
Rec-based
NIST SD19
600
92.46
Table 3.1. Comparison of different segmentation methods
This chapter proposes a segmentation-verification scheme for segmenting simple and multiple
connections using an oriented sliding window. Two morphological features are combined based
on the contour and the skeleton for detecting and splitting correctly the connected digits. In
order to avoid the over-segmentation and the under-segmentation, a sliding window is used for
finding interconnection points. When the interconnection points are found, the window is
oriented to define the optimal cutting path. The recognition and verification process is
performed using the Support Vector Machines (SVM) through the One-Against-All (OAA)
implementation.
33.3. Segmentation-verification system
The main difficult task for segmenting two connected digits is to find the interconnection points
and the optimal path for cutting the digit image into isolated elements [62, 75-77]. Figure 3.1
illustrates some difficult samples extracted from the NIST SD19 database [56].

47
Figure 3.1. Different types of connected digit samples.
Hence, the design of the proposed segmentation-verification system can be conducted into four
main steps: segmentation of connected digits by searching the interconnection points using the
oriented sliding window, generation of features on isolated digits, recognition of the segmented
digits using the SVM-OAA implementation and verification for accepting or rejecting the digit
recognition. More precisely, the segmentation of connected digits uses two complementary
structural features (contour and skeletal points) for detecting the interconnection and base
points and finding the optimal path. It can be performed through the following steps:
(1) Generate segmentation points;
(2) Set sliding window on the Interconnection Point (IP) positions;
(3) Rotate the window around IPs for finding correctly the cutting path;
(4) Scan all the possible relationships of Base Points (BP) and IP to determine the list of
segmentation hypotheses on the window.
(5) Evaluate the correct segmentation path using a classifier based on SVM-OAA.
In the following, we describe each step of the proposed segmentation-verification system for
segmenting two connected digits. This system has been published as shown in the reference
[78].
3.3.1. Segmentation of connected digits
The segmentation of connected digits can be performed by setting the window on a position in
order to find the IP and then searching the optimal cutting path by rotating the window around
the IP. These two procedures are explained more precisely in the following.

48
3.3.1.1. Finding the interconnection points
The finding of interconnection points is inspired from Oliveira et al. method (2000) [66] from
which we propose some modifications [73-74]. It involves analyzing the number and nature of
IP between two connected digits in order to define the optimal position for cutting a digit image
couple [8].
The first step is to define IPs and BPs from which starts the segmentation. BPs are obtained
from the local extrema (minima and maxima) detected on the contour (In contrast to Oliveira et
al. method (2000) [66], BPs are obtained from the detected local extrema on the contour and
profile of each connected component). While IPs are calculated using the Freeman code
according to 8 directions in the clock-wise. The upper contour is used in order to detect the
upper Base Point (Upper BP) whilst the lower contour is used to detect the lower Base Point
(Lower BP). Often, the connection points contain IPs, BPs or both at the same time.
Therefore, three hypothesiss can be considered for the optimal segmentation (Figure 2):
Hypothesis 1:
If the Euclidean distance between the projection of the BP and the IP is lower
than a threshold, the segmentation is making between IP with Upper BP and also between IP
with Lower BP (Figure 3.2.a).
Hypothesis 2:
If the lower segment of IP is related to a higher segment of IP (or vice versa)
and both IPs are near a BP, the skeleton path linking both IPs (Skeleton path) is used as part
of the segmentation cut with complementary paths between BPs and IPs (Composed path)
(Figure 3.2.b).
Hypothesis 3:
In some cases, even if there is a connection between two digits, the skeleton
path does not have an IP. Thus, to avoid the under-segmentation (lack of segmentation
point), the Oliveira et al. method (2000) [66] enables cut path directly from BPs from profile.
In our case, the segmentation path is based on the minimal Euclidean distance between
Upper BP and Lower BP from contour (closest points) in the middle (Figure 3.2.c).
The main problem of this algorithm is that it does not allow correctly finding the cutting path
when connected digits are oriented. Hence, we propose a method based on the oriented sliding
window for finding at the same time the IP and the cutting path.

49
Upper BP
IP
IP
Upper BP
Lower BP
Lower BP
Upper BP
Lower IP
Upper IP
Upper BP
Lower BP
Lower IP
Upper IP
Skeleton path
Skeleton path
Composed path
Lower BP
Composed path
Upper BP
Upper BP
Lower BP
Lower BP
(a)
(b)
(c)
Figure 3.2. Possible segmentation paths using IP and BP: Hypothesis 1 (b) Hypothesis 2 (c)
Hypothesis 3.
3.3.1.2. Finding the cutting path
Finding the cutting path is based on the search of the angular cutting for the optimal
segmentation. This search can be conducted through the following steps [74]:

50
Detecting the presence of IP by using a fixed window having the same height as the original
image and a constant width W1, the IP is thus located in the middle of this width.
Rotating the window in the range
]
,...,
[
with an angular step around IP. Figure 3.3
shows various crossing of the oriented window around IP.
Finding the cutting path for each oriented window: if a single IP is found, then the cut is
making between single IP with Upper BP and also between single IP with Lower BP when the
Euclidian distance between IP and BP (Upper BP or Lower BP) is lower than a threshold T1. If
there are two IPs (Upper IP and Lower IP), then the hypothesis 2 is applied. Otherwise, the
hypothesis
order to avoid under-segmentation. Figure 3.3 shows various segmentation paths according
the orientation of the window.
Figure 3.3. Crossing and the segmentation paths according the orientation of the window.
3.3.2. Feature generation
Various feature generation methods have been developed for handwritten digit recognition
[21,79,55,57]. In our case, we consider global and local structural and statistical features [80-
81], which are extracted from different regions of the isolated image digits by applying the
uniform grid method [54]. The selected features extracted from each region include [57]:
Global features as the density, center of gravity, second order geometrical moments and
number of transitions.
Hu's Moment Invariants.
Radon transform of the image for estimating the skew of the digit.
Zernike moments.
Horizontal and vertical projections.
Left and right profiles.
Concavity information deduced from the background of the digit image.
Contour and skeleton deduced from the foreground of the digit image.
0
0
0
IP
0
0
0

51
Ridgelet Transform.
Finally, the feature vector is composed by concatenating all mentioned features. It is useful to
note that these features have recently been used in Ref. 57 for realizing highest recognition
rates.
3.3.3. Recognition and verification
After isolating the digits, the classification is performed on input image in order to recognize
one of the existing classes of the system [82]. In order to make the whole system more
reliable, a verification strategy is performed for producing a final decision. Thus, the recognition
stage interacts with the verification stage in order to produce all possible segmentation
hypotheses.
The decision for an hypothesis of segmentation-recognition is given by the multiplication
between the decision functions done by its sub-components. The decision function of a sub-
component is given by the recognition system based on a multiclass SVM-OAA implementation
[27].
Finally, the decision function is performed by selecting the maximum value provided by each
segmentation hypothesis. When values of decision functions are equal, the maximum value of
the decision function is selected from the first sub-components. Figure 3.4 illustrates an
example of the segmentation-verification according hypotheses presented in Figure 3.2. This
verification can devote to reduce the confusion between isolated and under-segmented digits.

52
Figure 3.4. Illustrative example for segmenting two connected digits using the oriented sliding
window and recognition-verification strategy.
33.4. System evaluation
The evaluation of a segmentation method is very subjective. It can be done according two
approaches. The first one is to evaluate the segmentation based on a prior knowledge about
the effects of segmentation. The second approach is to evaluate the system for recognizing
handwritten digits by integrating segmentation and classification. In our case, we adopt the
second approach for evaluating performances of the segmentation on a standard database
NIST SD19.

53
3.4.1. Databases
The NIST SD19 database is provided by the National Institute of Standard and Technology
(NIST) [56]. The isolated handwritten digits have the advantage of being abundantly available
in the form of database labeled. Similarly of isolated handwritten digits, the database contains
connected handwritten digits, which is divided into three divisions (hsf_0123, hsf_7 and hsf_4).
The split of each division is illustrated in Table 3.2.
Table 3.2. Distribution of the NIST SD19 database for handwritten digits
The isolated digits from NIST SD19 database are used for finding the optimal parameters of the
multi-class SVM classifiers.
For an objective evaluation and comparison of our proposed system to the existing methods,
the touching image pairs used in this work have been extracted from the synthetic database of
79,466 images created by Ribas et al. (2013) [9]. This database composed of 79,966 is divided
into two sets: a validation set composed of 500 images and the remaining composed of 79,466
images used for testing. Table 3.3 and Table 3.4 show the type of touching digits proposed by
Chen and Wang [42] and the distribution of connected digits according the connection type.
Type 4 connection digits are not considered in the database [9].
Division name
#Isolated digits
#Connected digits
hsf_0123
195 000
4 044
hsf_4
58 646
1 819
hsf_7
60 089
2 369

54
Category
Touching type
Touching style
Example
Simple
connection
1
Single-point
touching
2
Single-segment
touching
3
No obvious
segmentation
point
4
Ligature
connection
Multiply
connected
5
Multiple-
touching
Table 3.3. Types of connected numeral strings
Table 3.4. Distribution of the database regarding the type of connection [9]
3.4.2. Parameter tuning of the SVM model
The recognition module is based on the multi-class SVM classifiers using the One-Against-All
implementation [28-30]. The training of the SVM requires selecting two parameters, which are
the regularization parameter (C) and the Radial Basis Function (RBF) kernel parameter ( ).
Hence, a fraction of an isolated handwritten digit dataset containing 150,000 samples is divided
into two parts: the first one containing 100,000 digit images is used for training and the second
one containing 50,000 digit images are used for validation the SVM models [26]. After many
experiments, parameters of the SVM classifiers are fixed to C=40 and = 8. Finally, all digit
images (150,000) are used for training the SVM, which allows improving the digit recognition.
Connection type
% of connected digits
1
2
3
5
34.80
53.62
01.59
09.99

55
3.4.3. Results
This section details the results of the experimental evaluations carried out to validate the
proposed segmentation-verification scheme for segmenting simple and multiple connections
using an oriented sliding window.
3.4.3.1. Influence of the orientation angle
The proposed algorithm requires adjusting manually the threshold T1 and the width of the
window W1. After many experiments, T1 and W1 are set to 7 and 4, respectively. In order to
study the effect of using the oriented sliding window, various orientation angles are selected in
the range
]
,...,
[
with angular step fixed to 5° to avoid an important number of
segmentation cuts. Table 3.5 reports the overall rates and the number (#) of segmentation cuts
for various angles
={0°, 5°, 10°, 15°}.
(°)
Connection type (%)
Overall rate (%)
#Segmentation cuts
1
2
3
5
0
90.79
89.80
89.30
70.77
88.24
3.44
5
95.02
93.34
98.82
75.30
92.21
10.33
10
96.67
93.64
99.68
77.54
93.18
17.22
15
96.67
93.75
99.68
77.58
93.24
24.11
Table 3.5. The recognition rate for various orientation angles
We clearly can note that the use of the oriented window influences the overall recognition rate.
Indeed, by increasing the angle from 0 to 15°, we can note that the overall rate significantly
increases whatever the connection type. The best improvement is offered for the connection
Type 3, which is due to the rotation angle that allows changing the segmentation path using
hypothesis 3 (No obvious segmentation point). This means that the position of the Upper BP
and Lower BP deduced from the contour is changed according the orientation angle in the
middle. Furthermore, in spite of the complexity for segmenting multiple connections (Type 5),
the proposed segmentation-verification system allows increasing the overall rate more than 6%
for Type 5 when the angle is increased from 0 to 15°. Table 3.6 illustrates the segmentation
effect according the selected orientation angle.

56
Connected digits
Digit labels
Incorrect segmentation
Correct segmentation
Angle
(°)
Segmentation
effect
Angle
(°)
Segmentation
effect
14
0
10
21
10
15
75
5
10
Table 3.6. Segmentation effect according the orientation angle
3.4.3.2. Complexity of the proposed segmentation-verification
The performance of the proposed segmentation-verification system requires increasing the
orientation angle in order to improve the accuracy. However, this increase inevitably leads to
the increase of the number of segmentation cuts. In order to appreciate empirically the
complexity of the proposed segmentation-verification, we deduce from Table 3.5, respectively,
figures 3.5, 3.6 and 3.7, which show the number (#) of segmentation cuts versus the
orientation angle
(°), the overall rate (%) versus the orientation angle
(°) and the overall
rate (%) versus the number (#) of segmentation cuts. It can be observed that the number of
segmentation cuts increases linearly with the orientation angle. In contrast, the overall rate (%)
does not increase in the same manner. Indeed, when the orientation angle increases from 0 to
15°, the overall rate increases of 5% while the number of segmentation cuts increases more
than 600%. Hence, a tradeoff between the accuracy and the number of segmentation cuts
should be adopted avoid to reduce the speed of the segmentation-verification.
Figure 3.5. Number (#) of segmentation cuts versus the orientation angle (°)
0
5
10
15
20
25
0
5
10
15
#Se
gme
ntation
cuts
(°)

57
Figure 3.6. The overall rate (%) versus the orientation angle
(°)
Figure 3.7. The overall rate (%) versus the number (#) of segmentation cuts
Table 3.7 reports some examples of the successful segmentation-recognition produced by the
proposed system according the connection type. We can note when the segmentation is well
performed then the recognition also performs well. However, our proposed system does not
provide satisfactory results in some cases as reported in table 3.8, which is caused by two main
problems:
The first problem is caused by the robustness of the classifier. Indeed, although, the
segmentation is well performed, the classifier does not allow recognizing the segmented digit
due to its particular shape. To resolve this problem, it is possible to add this particular digit
into the training set for improving the performance of SVM classifiers.
80
85
90
95
100
0
5
10
15
O
ve
rall
ra
te
(%
)
(°)
80
85
90
95
100
0
5
10
15
20
25
O
ve
rall
ra
te
(%
)
#Segmentation cuts

58
The second problem is caused by "no obvious segmentation point". In this case, the
hypothesis 3 (No IP) is often applied for segmenting connected digits. This means that the
segmentation path in the middle does not always provide acceptable results.
Furthermore, in some cases, our proposed system can provide unsuccessful segmentation­
verification results as depicted in figure 3.8. As shown, there are three possible mistakes of
applying the segmentation-verification of connected digits: incorrect segmentation and
recognition, correct segmentation and incorrect recognition and incorrect segmentation and
correct recognition. Most of the mistakes are due to recognition problems (07.76%).
Connection type
Original image
Digit labels
Correct
segmentation-recognition
1
13
10
2
01
36
3
10
21
5
27
90
Table 3.7. Some examples of the successful segmentation-verification produced by the proposed
system according the connection type

59
Connection
type
Original image
Digit labels
Incorrect
segmentation
Assigned
classes
1
16
40
51
57
2
96
06
07
97
3
12
0-
47
57
5
00
20
09
69
Table 3.8. Some examples of the unsuccessful segmentation-verification produced by the proposed
system according the connection type
(a) 47
19
(b)
94
54
(c)
41
41
Figure 3.8. Unsuccessful segmentation-verification of connected digits produced by the proposed
system (a) Incorrect segmentation and recognition (b) correct segmentation and incorrect recognition
(c) Incorrect segmentation and correct recognition

60
3.4.3.3. Comparative analysis
To evaluate the robustness of our system when
= 15° comparatively to other existing
systems, we report in Table 3.9 the correct segmentation and its corresponding number of
segmentation cuts. Table 3.9 reports a comparative analysis provided by Ribas et al (2013) [9]
for which we add results obtained by Everton and Carlos (2013) [72], Oliveira et al. method
(2000) [66], Congedo et al. method [61] and the proposed segmentation-verification system. It
is worth noting that, the segmentation method proposed by Oliveira et al. [66] and the
segmentation method proposed by Congedo et al. [61] have been implemented using
information provided in their respective publications.
Method
Connection type (%)
Overall rate
(%)
#Segmentation
cuts
Simple
Multiple
1
2
3
5
Chen and Wang [42]
97.87
94.23
97.55
76.76
93.80
45.40
Congedo et al. [61]
62.88
67.51
59.40
40.45
63.07
1.00
Elnagar and Alhajajj [64]
63.88
71.51
56.40
58.73
67.34
1.00
Everton and Carlos [72]
71.75
71.21
63.64
56.57
65.79
1.00
Fenrich and Krishnamoorthy [67]
97.54
93.79
99.45
65.57
92.37
4.07
Fujisawa et al. [62]
95.45
91.27
83.57
63.72
89.85
3.66
Oliveira et al. [66]
90.40
90.78
89.01
64.88
88.03
1.00
Pal et al. [65]
73.96
74.69
80.09
41.52
71.21
1.00
Shi and Govindaraju [63]
68.31
59.72
60.35
25.44
59.30
1.00
Our approach
96.67
93.75
99.68
77.58
93.24
24.11
Table 3.9. Comparative analysis
The proposed method achieves an overall performance greater than 93%. The correct
segmentation-recognition for connection types 1, 2, 3, and 5 are 96.49, 93.75, 99.68 and
77.58%, respectively. Roughly speaking, the proposed method is more appropriate for
segmenting digits having connection types 3 and 5, respectively. Furthermore, we can observe
that the proposed method offers comparable overall rate as Chen and Wang method [42].
However, their method requires more segmentation cuts (more than 45) and is very slower
comparatively to the proposed method, which requires a reduced number of segmentation cuts.
The worst overall performance is achieved by the methods proposed by Shi and Govindaraju
[63], Congedo et al. [61], Elnagar and Alhajajj [64] and Everton and Carlos [72]. The method
proposed by Shi and Govindaraju [63] achieves a worst overall performance.
In contrast, the proposed method and the segmentation algorithm proposed by Chen and
Wang[42] produce almost the same overall performance. However, by examining carefully the

61
obtained results, we can note that the proposed method surpasses all existing methods for
connection types 3 and 5. Furthermore, the Chen and Wang method is still very slow against
the proposed method since many attempts should be evaluated for recognition.
A careful observation of obtained results depicted in Figure 3.9 shows that when the number of
the segmentation cuts increases then the overall rate also increases. The proposed method
outperforms most other state-of-the-art methods except the segmentation algorithm proposed
by Chen and Wang [42]. Indeed, they obtained an overall rate of 93.80% against 93.24% for
our proposed method. When observing Table 8, we can note that the Chen and Wang method
is suitable for connection types 1 and 2. In contrast, our method is suitable for connection types
3 and 5. It is useful to recall that the connection type 5 is the more complicated situation for
separating two connected digits since multiple connections are occurred when attempting to
separate digits.
Figure 3.9. Performance of the segmentation-recognition algorithms
59.30
63.07
65.79
67.34
71.21
88.03
89.85
92.37
93.24
93.80
1
1
1
1
1
1
3.66
4.07
24.11
45.40
0
20
40
60
80
100
Shi
Congedo
Everton
Elnagar
Pal
Oliveira
Fujisawa
Fenrich
Our approach
Chen
Seg
m
entati
on method
Number of segmentation cuts
Correct segmentation (%)

62
Table 3.10 reports the rank of the proposed method comparatively to other existing methods
according the type of connections. We clearly see that our method is more appropriate for
segmenting no obvious segmentation (Type 3) and multiple touching digits (Type 5). In
contrast, the method proposed by Chen and Wang [42] is more suitable for single point
connection (Type 1) and single segment connection (Type 2).
Rank
Connection type
1
2
3
5
1
Chen and Wang [42]
Chen and Wang [42]
Our approach
Our approach
2
Fenrich and
Krishnamoorthy [67]
Fenrich and
Krishnamoorthy [67]
Fenrich and
Krishnamoorthy [67]
Chen and Wang [42]
3
Our approach
Our approach
Chen and Wang [42]
Fenrich and
Krishnamoorthy [67]
4
Fujisawa et al. [62]
Fujisawa et al. [62]
Oliveira et al. [66]
Oliveira et al. [66]
5
Oliveira et al. [66]
Oliveira et al. [66]
Fujisawa et al. [62]
Fujisawa et al. [62]
6
Pal et al. [65]
Pal et al. [65]
Pal et al. [65]
Elnagar and Alhajajj
[64]
7
Everton and Carlos [72]
Elnagar and Alhajajj
[64]
Everton and Carlos
[72]
Everton and Carlos
[72]
8
Shi and Govindaraju
[63]
Everton and Carlos
[72]
Shi and Govindaraju
[63]
Pal et al. [65]
9
Elnagar and Alhajajj
[64]
Congedo et al. [61]
Congedo et al. [61]
Congedo et al. [61]
10
Congedo et al. [61]
Shi and Govindaraju
[63]
Elnagar and Alhajajj
[64]
Shi and Govindaraju
[63]
Table 3.10. Rank of our method comparatively to the existing methods according the
connection type
It should however be noted that the proposed method can deal with several types of
connection between the digits specifically both connection types: no obvious segmentation
point and multiple connections. Other hand, it achieves encouraging results related to other
state-of-the-art methods.
33.5. Summary
The objective of this chapter aimed to propose a SVM-based segmentation-verification system
for segmenting two connected handwritten digits using the oriented sliding window and SVM
classifiers in order to find the best cut for isolating two adjacent digits.
Experimental results showed that the performance of the system depends on the selection of
the appropriate orientation angle. Indeed, obtained results clearly show that the recognition
rate increases when selecting the appropriate angle. Therefore, choosing an optimal angle is a

63
crucial step for cutting optimally two connected digits. When comparing the proposed approach
to the state of the art, we can note that it constitutes a tradeoff between the correct
segmentation and the number of the segmentation cuts.
The next chapter devotes to present the proposed a segmentation and recognition system for
unknown-length handwritten digit strings by combining several explicit segmentation methods.

64
CC
HAPTER
H
ANDWRITTEN DIGIT STRING RECOGNITION
The segmentation of handwritten digit strings into isolated digits remains a challenging task.
The difficulty for recognizing handwritten digit strings is related to several factors such as
sloping, overlapping, connecting and unknown-length of the digit string. Hence, this chapter
aims to propose a segmentation and recognition system for unknown-length handwritten digit
strings by combining several explicit segmentation methods depending on the configuration link
between digits. Three segmentation methods are combined based on histogram of the vertical
projection (HVP), the contour analysis and the Sliding Window Radon Transform (SWRT). A
recognition and verification module based on Support Vector Machine (SVM) classifiers allows
analyzing and deciding the rejection or acceptance each segmented digit image. Experimental
results conducted on the benchmark dataset show that the proposed system is effective for
segmenting handwritten digit strings without prior knowledge of their length comparatively to
the state-of-art.
4.1. Introduction
The Automatic Handwritten Digit String Recognition (AHDSR) is required in many applications
such as the amount of the bank checks, postal code and forms [1, 3, 61, 83, 84]. In this
context, two main problems are occurred when attempting to design a handwritten digit string
recognition system. The first problem is the unknown-length of the digit string, which is not
carefully written by people in real-life situations [85]. The second problem is the link between
adjacent digits, which can be naturally spaced, overlapped or/and connected [5].
We propose, in this chapter, a new design of a handwritten digit string recognition system
based on the explicit approach for the unknown-length digit strings. Three methods are
combined according the link of adjacent digits, which are the histogram of the vertical
projection dedicated for spaced digits, the contour analysis dedicated for overlapped digits and
the Radon transform performed on the sliding window dedicated for connected digits.
Our ultimate objective is to propose a full segmentation algorithm based on the explicit
approach for handwritten digit string recognition with the unknown-length string.

65
44.2. Overview of handw ritten digit string recognition
Various AHDSR systems have been developed in the past years, which can be divided into two
approaches: implicit and explicit. The implicit approach considers all traced points as potential
segmentation points [63]. In this case, the segmentation and recognition are performed
simultaneously for recognizing a digit string. Indeed, this approach does not attempt to
separate digits, but rather it incorporates the implicit segmentation into the recognition module.
For instance, Britto et al [85] proposed an AHDSR based on the Hidden Markov Models (HMM).
In their approach, the segmentation-recognition is performed into two steps. The first one takes
into account some contextual information to deliver several segmentation-recognition
hypotheses for a given preprocessed string. These hypotheses are used in a second step by
using an isolated digit classifier for verifying the accuracy of the recognition. The performance
evaluation has been conducted by considering the known-length strings and the unknown-
length strings using the length predictor.
The explicit approach is, in contrast, performed by finding the best way to isolate adjacent
digits before recognition. In this case, the segmentation and recognition are performed
separately. Also, three cases can be occurred when attempting to separate two adjacent digits:
spaced, overlapped and/or connected digits. In this case, the design of a robust handwritten
digit string recognition system can be performed into two steps: primary and secondary
segmentation. The primary segmentation is performed when the adjacent digits are naturally
spaced. In this case, digits are separated using a simple histogram of the vertical projection
[62]. This segmentation requires a correct writing of the digit image to avoid separating a digit
into multiple fragments. In contrast, the secondary segmentation is performed on the resulting
image produced from the primary segmentation in order to detect any presence of overlapped
digits and/or connected digits. Hence, various methods have been developed specifically for
overlapped and connected digits.
Segmentation methods dedicated for overlapped digits are performed when digits are slopped
or appear to imperfect alignment between the axis and the written text. Under some conditions,
it is possible to apply the methods used in cursive script text to estimate the skew value [86].
In some cases, these methods do not provide an efficient segmentation when overlapped digits
are complicated. In contrast, connected digits are the frequent situations occurred when
attempting to separate two adjacent digits. Hence, some recent segmentation methods
dedicated for connected digits have been proposed for single or multiple touching such as the
Drop-fall based algorithms [61], the thinning-based algorithms [87], the water reservoir
algorithms [65] and the removing useless stroke algorithm [42].

66
In this context, Oliveira et al. [59] proposed a segmentation and recognition system using some
heuristics to avoid an over or under-segmentation combined with the classifier and the verifier
based on Multi-Layer Perceptron (MLP) neural networks. All segmentation points are used to
build a segmentation graph. The best segmentation hypothesis is the shortest path of the
graph. A post-processing is used to resolve some problems where the verifiers failed and vice
versa, and the global decision module makes an accept/reject decision. The performance
evaluation of the system has been conducted on NSTRING SD19 database by considering the
unknown-length digit strings. Latter, Oliveira and Sabourin [88] proposed to replace the MLP
classifier by the Support Vector Machine (SVM) classifier in order to improve the accuracy of the
digit string recognition system. Sadri et al. [89] described a genetic framework using contextual
knowledge for segmentation and recognition of handwritten digit strings. This method has been
developed to locate feature points on the string image in order to generate possible
segmentation hypotheses. A genetic algorithm is used as a general search technique for finding
the global optimum segmentation hypotheses in digit strings. Experimental results have been
conducted on the digit strings extracted from NSTRING SD19 database using two different
classifiers (MLP and SVM) by considering the known and unknown-length digit strings.
44.3. Segmentation methods of the digit string
The choice of digit string segmentation depends of the link between adjacent digits. Hence,
four situations are occurred which are spaced, overlapped, connected and
overlapped/connected digits, respectively. Thus, this section is devoted to describe
segmentation methods used in our system for which we propose a new segmentation method
performed specifically for connected digits.
4.3.1. Segmentation of spaced digits
The segmentation of spaced digits is performed using the histogram of the vertical projection
(HVP) when digits are well written with some degree of neatness (not connected or not
overlapped). The HVP is performed on the binary digit string from which a simple count of the
black pixels in each column is running in order to detect the white space between successive
digits [62]. This allows determining the location of each component contained into the image.
The advantage of this method is its ability to segment the digit string with the unknown-length.
However, when digits are overlapped or connected, the segmentation is not possible. Indeed,
the HVP can split the digit string comprising either a single digit or more than two digits. Hence,
each isolated digit is well recognized by the classifier. In contrast, when digit components are
connected or overlapped, the classifier can't recognize the digit. In this case, the classifier plays
an important role for detecting the overlapped or/and connected digits. Figure 4.1 illustrates an
example of the correct and incorrect segmentation of the digit string when using the HVP.

67
Figure 4.1. Correct and incorrect segmentation using the HVP.
In some cases, segmented digits are broken in multiple parts (Figure 4.2). In this case, the
recognition is performed by the classifier without any preprocessing method.
Figure 4.2. Sample images of broken handwritten digits.
4.3.2. Segmentation of overlapped digits
The segmentation of overlapped digits is performed when adjacent digits are overlapped. It is
based on the contour analysis using the contour detection, which is extracted from the binary
image using the mathematical morphology [61, 65, 59, 88, 90]. Hence, two adjacent digits are
then separated using a distance termed , which is fixed experimentally. Figure 4.3 illustrates
the possible and impossible segmentation when analyzing the digit contour.
(a)
(b)
Figure 4.3. Segmentation by contour analysis. (a) Possible segmentation. (b) Impossible segmentation
Spaced digits
Overlapped digits Connected digits

68
In some cases, when using the contour detection, it appears broken digits which are necessary
to process for improving the performance of the segmentation. In this case, the broken parts of
an overlapped component are detected by examining the intersection with the median line (i.e.
half height) of each component image [91]. Figure 4.4 presents two examples of the incorrect
segmentation.
Figure 4.4 Examples of incorrect segmentation when using the contour analysis.
Generally, a digit is broken into two parts for which we associate each one a contour namely
(C1) and (C2), respectively. Figure 4.5 shows different configurations of a broken digit. Hence,
three rules are specifically used for their detections:
Rule 1: when both C1 and C2 intersect the median line, then the broken parts are considered
as two distinct overlapped single-digits. In this case, these adjacent digits are straightforward
separated using a fixed distance (Figure 4.5.a).
Rule 2: When both C1 and C2 do not intersect the median line, then the broken parts are
considered belonging to the same single-digit (Figure 4.5.b).
Rule 3: When C2 does not intersect the median line and C1 intersects the median line, then
C1 and C2 belong to the current digit (Figure 4.5.c ) otherwise C1 and C2 belong to the next
digit (Figure 4.5.d). For both cases, the belonging of C2 to the current digit or to the next digit
is decided by the classifier.

69
(a)
(b)
(c)
(d)
Figure 4.5. Processing of broken parts (a) two distinct overlapped single-digits (Rule1) (b) broken
single-digit (Rule2) (c and d) broken digit (Rule 3)
4.3.3. Segmentation of connected digits
A great effort has been conducted for developing various techniques to detection the connected
digits using information on the contour, skeleton and profile of the digit string. For instance,
Elnagar and Alhajj [64] proposed to split touching digits using thinning processes. The pattern
is thinned to obtain a uniform stroke width and then the feature points are extracted from the
skeleton and contour as end points, joint points and crossing points. Latter, Suwa [3] proposed
an improvement of the method proposed by Elnagar and Alhajj [64] by adopting a thinning-
based segmentation strategy for a couple of handwritten connected digit. Kyung et al. [68]
proposed using structural features of contours to generate multiple hypotheses. These
hypotheses are based on the analysis of the ligature and the characteristics of candidate break
points. Lei et al. [69] proposed an analysis of external and internal contours as well as
projection analysis. These features are used to define the segmentation lines. Ma et al. [70]
described a touching pattern-oriented strategy using Drop-fall algorithms [61] according to four
possible cutting paths (Descend-left, Descend-right, Ascend-left and Ascend-right). Wang et al.
[71] proposed to generate all possible candidate segmentations of an input string through
contour and profile analysis. Recently, Gattal and Chibani [78] proposed a method for finding
the Base Points (BPs) and Interconnection Points (IPs) on the contour and the skeleton of the
connected digits according to the connection configuration. After that, a crossing oriented
window is set around IP for finding correctly the cutting path.
In this work, the algorithm is slightly adapted such that it can segment connected digits with
unknown lengths. Also, the Radon transform is used in order to perform on the sliding window
for selecting the optimal orientation angle and consequently reducing the number of
segmentation cuts. The proposed method uses conjointly the contour and the skeleton of the
C1
Height
C2
C1
C2
C1
C2
C1
C2
Median line

70
digit string for detecting point connections, which can be performed according to the following
steps:
Step 1: Apply a contour detection on the connected digit string in order to detect all possible
BPs from the local extrema (minima and maxima).
Step 2: Perform the skeleton algorithm in order to detect all possible IPs.
Step 3: Find IPs by analyzing the skeleton between connected digits. If IP is detected, a
sliding window having the same height as the digit image and a fixed width (
) is set on
IP in the middle of the width (
). Then, the Radon transform is performed on the sliding
window around IP for finding the orientation angle (
)
in order to get the best cutting. This
window is called the Sliding Window Radon transform (SWRT). The optimal orientation angle
(
) and the width (
) match a right angle inter-digit into the most cases. Also, it
allows reducing the number of segmentation cuts.
Step 4: For each SWRT detected in the digit string, a list of segmentation hypotheses is
provided in order to find the best cutting path. Therefore, three hypotheses can be
considered: if a single IP is found, then the cut is making between single IP with upper BP
and lower BP. When two IPs (Upper IP and Lower IP) are found, then the cut is making
between BPs and IPs. In order to avoid under-segmentation, the cut is making between
closest points (Upper BP and Lower BP) in the middle. More details are reported in [78].
Step 5: In some cases, digits may be connected in such a way there is no IP on the skeleton
leading to failure of positioning the SWRT. In this case, to avoid under-segmentation due to
lack of IPs, the cut is made between closest upper and lower BPs in the middle of two
connected digits.
The evaluation of each hypothesis is performed using the digit recognition-verification module,
which is presented in section 4.3. Furthermore, the number of the segmented components
depends on the number of detected IPs. Hence, two important parameters influence the quality
of the segmentation, which are the adjustment of the width (
) and finding the orientation
angle (
) using SWRT.
4.3.3.1. Adjustment of the width
(
)
Adjusting the appropriate width (
) of the SWRT directly influences the performance of
the segmentation. For instance, a large W
generates small number of cutting paths leading
to an over-segmentation. In contrast, a small
generates many cutting paths leading to
an under-segmentation. Consequently, a trade-off between small and large width should be
defined in order to lose or ignore some important cutting paths. Figure 4.6 illustrates the
influence of adjusting the width of the sliding window.

71
In our case, the width
is set as a fraction of its height i.e.
=
/
where
is the height of the connected digits and
is a parameter defined experimentally.
When the height of connected digits is changed, the width
is also changed in order to
keep the aspect ratio and allows avoiding the problem of variations in the shapes and styles.
Figure 4.6. Influence of adjusting the width on the sliding window.
4.3.3.2. Finding the angular cut via the Radon transform
The skew or orientation angle of the connected digits is estimated using the Radon transform of
the image [47-48]. It is computed by projecting the image along specified directions, which
allows generating a set of line integrals or slices from multiple sources. More precisely, the
Radon transform takes multiple and parallel projections of the image from different angles by
rotating the source around the center of the image. Formally, the Radon transform is performed
on a binary image, which is defined as [48]:
g(r, ) =
I(x, y). (x cos + y sin
r) dxdy
(4.1)
where
is the Dirac function,
] 0, 179°] and r ]
, + ]. In other words, g(r, ) is the
integral of I(x, y) over the line defined by = x cos + y sin .
Figure 4.7 shows some illustrative examples of the Radon transform computed on the binary
connected digit images. The high brightness indicates the maximum amplitude of the Radon
transform. Therefore, the orientation angle, which defines the angular cut (
), is deduced by
finding the maximum value of the Radon transform g(r, ). Figure 4.8 shows the Radon slices
for two selected projection angles. The angular cut (
) is selected according to the amplitude
of the Radon slice. The slope of the SWRT fixed around IP allows getting a better cutting path.
Figure 4.9 shows the impact of selecting the orientation angle around IPs for segmentation with
and without using the SWRT.
Upper BP
IP Lower BP
W
SWRT
Upper BP
IP
Lower BP
W
SWRT

72
Figure 4.7. Radon transform for three examples of connected digits. The maximum value of the
Radon transform corresponds to the orientation angle
for cutting two connected digits.
Figure 4.8. Two selected projections of the Radon Transform showing
= 51°
0
10
20
30
40
50
60
70
80
90
100
-60 -50 -40 -30 -20 -10 0 10 20 30 40 50 60
g(
r,
)
r
51°
140°

73
(a)
(b)
Figure 4.9. Impact of selecting the orientation angle for segmentation
(a) Segmentation with SWRT (b) Segmentation without SWRT
44.4. Design of the digit string recognition system
The proposed digit string recognition system allows examining each type of the segmentation
problem (spaced, overlapped or connected digits) in order to select the appropriate
segmentation method. As shown in Figure 4.10, its design requires three main modules, which
are the primary segmentation, secondary segmentation and digit recognition-verification. The
digit string image is primarily segmented using HVP, which produces multiple Segmented
Components (SC) each one is verified according to its size using the Segmented Component
Analysis (SCA). If the size is greater than a fixed threshold, SC is considered as a digit (isolated
digit). Therefore, it is submitted to Digit Recognition-Verification (DRV) for accepting or
rejecting the digit. In contrast, if SC is rejected by DRV, the secondary segmentation is
performed to separate the overlapped and/or connected digits. In this case, the contour
Binary image
Inverted binary image
Orientation angle
detection
IP
Upper BP
Orientation of the
window to
Lower BP
directions
Upper BP
Lower BP
IP
Cutting of connected
digits

74
detection is performed on SC considered as a non digit, which is analyzed and verified using
successively Contour Analysis (CA), SCA and DRV, respectively. Otherwise, SWRT technique and
Grouped Component Analysis (GCA) are used conjointly for segmenting SC. GCA uses DRV for
evaluating each Grouped Component (GC) according to the configuration link in order to
provide optimal decisions in difficult situations. In the following, we first describe the design of
DRV then the steps for segmenting the digit string image.
Figure 4.10. Full segmentation system for handwritten digit string recognition.
4.4.1. Digit recognition-verification
The Digit Recognition-Verification module is composed of two sub-modules, which are the Digit
Recognition based on feature generation module and the classification module, and the Digit
Verification. Their design directly affects the robustness of the proposed handwritten digit string
recognition. In the following, we briefly describe feature generation method, classifiers and the
technique of verifying the digit.
4.4.1.1. Feature generation
Various feature generation methods have been proposed in recent years. In the proposed
system, we use a combination of multiple features for improving the overall recognition rates by
minimizing the intra-class variability and maximizing inter-class variability. These features used
in the work [57] include some global statistics, moments, profile and projection based features
and features computed from the contour and skeleton of the digit. Some of these features are
extracted from the whole image of the digit while others are extracted from different regions (4
regions) of the image by first applying a uniform grid sampling to the image. Hence, we notice
Primary segmentation
Digit recognition-verification
Secondary segmentation
Handwritten digit
string image
HVP
Segmented
Component Analysis
Contour
Detection
Contour
Analysis
SWRT
Grouped Component
Analysis
Digit
Recognition
Digit
Verification
Digit
Non Digit
Accept
Reject
SC
GC
SC

75
that the uniform grid sampling and used features are well suited for discriminating isolated
digits from under-segmented ones.
4.4.1.2. Design of the SVM classifiers on isolated digits
The classification module is based on the well known Binary Support Vector Machine (B-SVM)
classifiers implemented through the One-Against-All (OAA) architecture. Ten (10) B-SVMs are
then used for recognizing an isolated digit from the set = { , . . ,
}
such that
is the class
label of a digit j = 0, . . ,9. Each B-SVM classifier is trained using the radial basis function kernel
for separating a digit class from other classes.
More precisely, each digit is associated to a SVM classifier providing a response termed
f (x ), j = 0, . .9 such that x
is the feature vector of the Segmented Component (SC).
Therefore, SC is assigned to one the classes according to the following decision rule:
f
(x ) = max f (x ); j = 0, . . ,9
(4.2)
f
(x ) is the maximal value selected from 10 responses provided by SVM classifiers.
4.4.1.3. Digit verification
The maximal value selected from 10 SVM responses does not ensure that SC is a true isolated
digit. Indeed, in some cases, SC submitted to the recognition can be a connected or overlapped
digit. In order to ensure a better recognition of the digit, we propose to add a threshold termed
t as a verifier using the following decision rule: x
is accepted as a digit if f
(x )
t
otherwise it is rejected. Conversely, when x
is rejected (i.e. f
(x ) < ), SC is then
considered as a non digit (overlapped and/or connected digit).
The threshold t is deduced during training of the SVMs by taking the minimum of the SVM
maximum outputs from all isolated digits, let:
t = min {max f (x ); j = 0, . .9 ; i = 1, . . N} (4.3)
x is the feature vector of the isolated digit and N is the number of isolated digits used for
training the 10 SVM classifiers.
4.4.2. Spaced digit recognition-verification
The spaced digit recognition-verification is performed using successively the HVP, SCA and DRV.
The HVP allows producing multiple SCs each one is analyzed by SCA in order to decide if it is a
digit or non digit. In our case, SC is considered as an isolated digit by exploiting the ratio
between its height and width using the following decision rule:

76
(4.4)
Such that R
= Height/Width and t
is a fixed decision threshold. Consequently, SC is
considered as an isolated digit if R
t
otherwise it is considered as an overlapped or
connected digit. When SC is detected as an isolated digit, it is submitted to the DRV for its
recognition otherwise it is submitted to the secondary segmentation module. The ratio of SC is
considered as additional information that helps the classifier to avoid the under-segmentation
problem.
In some cases, connected digits are verified as isolated digits by SCA since their ratio is greater
than a threshold (i.e. R
t
) as shown for instance in Figure 4.11. To avoid
misrecognition, SC is accepted definitively using the decision rule:
x
Accept if R
t
and f
(x )
t
Reject otherwise
(4.5)
When SC is rejected, it is submitted to the secondary segmentation module.
Figure 4.11. Some samples of connected digits considered by SCA as isolated digits.
4.4.3. Overlapped digit recognition-verification
The overlapped digit recognition-verification is performed when the ratio R
of SC is less than
a threshold (t
). In this case, SC is considered as non digit and can contain more than one
overlapped digits. Therefore, a contour detection is performed on SC for their separation using
a fixed threshold (t ) and some specific rules (see 4.3.2). The resulting sub-components are
verified by SCA for deciding if each one is a digit or non digit. When it is detected as a digit,
DRV is used for accepting or rejecting it. In the case of rejecting, the segmentation method
based on SWRT and GCA are performed to separate two or more connected digits according to
the connection type.
4.4.4. Connected digit recognition-verification
The connected digit recognition-verification uses conjointly SWRT method, GCA and DRV. When
using SWRT, the connected digits are split into a sequence of segments each one is considered
as a segmentation hypothesis, which is expected to contain a digit or a fragment of a digit. In
this case, all segmentation hypotheses can be represented through a segmentation graph
[08,42]. More specifically, the recognition-verification of connected digits can be performed into
two steps:

77
Generate all possible segmentation hypotheses on the connected digit string using SWRT.
Each hypothesis defines an isolated digit or a fragment of a digit, which is verified by DRV.
Perform GCA in order to enable the duplication of segments.
Hence, GCA allows grouping the sequence of segments for generating Grouped Components
(GC), which are submitted to DRV for accepting or rejecting. If GC is accepted then it is
considered as a digit otherwise it is considered as a non digit. The grouping of segments using
conjointly GCA and DRV is performed according to the following heuristic rules:
GC is submitted to DRV only if GC intersects the median line.
If GC intersects the median line and not accepted by DRV then it is grouped with the previous
or next segment and tested with DRV using the following decision rule:
x
Accept if GC intersects median line and f
(x )
t
Reject otherwise
(4.6)
The reject of GC is considered as an unlikely segmented component. In this case, it is
removed from the list of hypotheses.
When GC does not intersect the median line then it is grouped with the previous or next
segment component and tested with DRV, and so on.
Figure 4.12 shows an example of using the heuristic rules for grouping the segments while
Figure 4.13 depicts an illustrative example of segmenting a digit string image.
Figure 4.12. Connected digit recognition-verification (a) Original connected digits (b) Scan IPs (c)
Fixing sliding window Radon transform around IP (d) Segmentation paths (e) Final decision.
Upper BP
IP
(a) (b) (c)
Lower BP
(d)
(e)

78
Figure 4.13. Segmentation example of our proposed digit string recognition system.
44.5. Experimental results
In order to show the effect of using multiple segmentation methods on unknown-length of the digit
string, the well known NIST SD19 database [56] is used containing isolated digits and digit strings.
Experimental results are conducted for evaluating the primary and secondary segmentations as well
as the overall segmentation. Since the segmentation of the connected digits affects considerably the
performance of the digit string recognition, a careful evaluation is conducted to study the influence of
the Sliding Window Radon Transform parameters. All best obtained results are then after compared to
the state-of-art.
4.5.1. Databases and evaluation criteria
For a fairly evaluation and comparison with the existing similar algorithms, we use the same
database NIST SD19 as used in Refs. [59, 85, 88]. Hence, two digit sample sets are extracted
from NIST SD19. The first one containing 90,000 handwritten isolated digits is used for building
DRV since it has more variations in their shapes and styles. While, the second one containing
12802 digit string images is extracted from the hsf_7 series NIST NSTRING SD19 database,
which is distributed into six categories according to their lengths: 2-digit (2370), 3-digit (2385),
4-digit (2345), 5-digit (2316), 6-digit (2169), and 10-digit (1217) strings, respectively. These
data exhibit different difficulties such as sloping, overlapping and connecting as well as different
Accept Reject
Reject
Accept
0
1 2 3 4 5
6 7 8
9
Reject Accept
Digit string
Segmentation and
recognition-verification of
spaced digits.
Segmentation and
recognition-verification of
overlapped digits
Segmentation and
recognition-verification of
connected digits
Segmented and recognized
digit string

79
string lengths. Table 4.1 shows the distribution of the database according to the various
difficulties.
String length
#Strings
#SD
#C-OD
SD (%)
C-OD (%)
2-digit
2370
1656
714
69.87
30.13
3-digit
2385
1303
1082
54.63
45.37
4-digit
2345
1219
1126
51.98
48.02
5-digit
2316
1125
1191
48.58
51.42
6-digit
2169
1202
967
55.42
44.58
10-digit
1217
679
538
55.79
44.21
Total
12802
7184
5618
56.12
43.88
Table 4.1. Number of digit string samples (#Strings) distributed according to the Numbers of
Spaced Digits (#SD) and Connected and/or Overlapped Digits (#C-OD) expressed also in %.
The evaluation criterion used for conducting experiments and comparison is based on the
recognition rate (Rec. Rate) performed on the six digit string lengths: 2-digit, 3-digit, 4-digit, 5-
digit, 6-digit and 10-digit strings, respectively.
All recognized segmented components provided by the proposed system are compared to the
ground truth. If the segmented component is assigned to the corresponding ground truth then
the segmentation is considered as successful.
The evaluation of the proposed system is conducted by providing in each time the recognition
rate obtained for primary and secondary segmentations as well as the overall rate for each
string length.
4.5.2 Experimental setup
The performance of the proposed digit string recognition system depends on the robust design
of DRV on isolated digits, the appropriate selection of the decision threshold (t ) used for
accepting or rejecting the digit, the threshold t
used for deciding if SC is a digit or non digit,
the distance (t ) between two overlapped digits, used for adjusting the width of the sliding
window (W
) and the adjustment of the Radon transform parameters (range of the
projection angle and the angular step). t and t
are deduced from isolated digits whereas t
and
are fixed from the digit string images. Therefore, all these parameters are fixed as
follows:
t = 0.2: t is selected by taking the minimum value of the maximum SVM outputs among
recognized 90,000 isolated digits.
t
= 2: t
is deduced from 90,000 isolated digits by taking the maximal ratio of R
.
t = 5 pixels: t is fixed by examining the overlapped digits contained into NSTRING SD19.

80
= 5: W
is selected as the height fifth of connected digits by examining all connected
digits contained into NSTRING SD19.
4.5.3. Design of SVM classifiers on isolated digits
The performance of a digit string recognition system depends in particular on the design of a
robust SVM classifier on isolated digits. In our case, the recognition module is based on the
SVM multi-class approach using the One-Against-All implementation [30]. Two important
parameters are required for training the SVM classifiers, which are the regularization parameter
(C) and the Radial Basis Function (RBF) kernel parameter ( ). Hence, SVM classifiers are
trained and their parameters are adjusted by using the NIST SD19 handwritten isolated digit
database.
The efficiency of the SVM classifiers also depends on the number of samples used for building a
robust DRV. Thus, 90,000 handwritten digits are divided into sub-sets, which are used for
training (60,000 images) and validating (30,000 images) the SVM classifier parameters,
respectively. When parameters are found, the SVM classifiers are retrained by considering all
handwritten isolated digit images (90,000 images). More precisely, training and retraining of the
SVM classifiers are performed according to the two following steps:
Training: During this step, C and
are varied in the interval [1, 100] and [1, 50],
respectively. The optimal couple is selected when the validation rate is maximal. In our case,
the optimal C and parameters are set to 18 and 11, respectively.
Retraining: After training, SVM classifiers are retrained using all samples (training and
validation, i.e 90,000 samples) by keeping the same parameters (C=18 and = 11).
Table 4.2 reports the performance of the proposed segmentation and recognition system when
performing training and retraining SVM classifiers. The recognition rates (%) are reported for
the primary and secondary segmentations as well as the overall recognition for each string
length. The projection angle of the Radon transform is ranged in [1, 179] while the value of
the angular step is set to 10°.

81
Rec. Rate when Training SVMs
(%)
Rec. Rate when Retraining SVMs
(%)
String length
#Strings Primary
Secondary
Overall
Primary
Secondary
Overall
2-digit
2370
99.73
79.55
93.65
99.88
80.46
94.03
3-digit
2385
99.49
85.67
93.22
99.87
86.94
94.00
4-digit
2345
98.69
86.92
93.04
99.20
88.45
94.04
5-digit
2316
97.69
88.38
92.90
98.31
89.76
93.91
6-digit
2169
98.13
87.81
93.53
98.68
90.06
94.84
10-digit
1217
97.81
86.36
92.74
98.38
88.64
94.08
Overall
-
98.59
85.78
93.18
99.05
87.38
94.15
Table 4.2. Recognition rate (%) obtained when training and retraining SVM classifiers.
We clearly can notice that the retraining the SVM classifiers improves the best overall
recognition rate against the training of the SVM classifiers. Indeed, the overall recognition rate
when retraining the SVM classifiers significantly increases whatever the string length. This
improvement is due to the robustness of SVM classifiers for building an efficient DRV.
Furthermore, we can observe that the best improvement is specifically offered for the
secondary segmentation compared to the primary segmentation. This means that the
recognition of spaced digits is more simple than the recognition of overlapped or/and connected
digits.
4.5.4. Evaluation of the digit segmentation
The performance of the segmentation is strongly influenced by the presence of connected digits
into the digit string. In our system, the segmentation of connected digits is performed using the
Radon transform and its appropriate parameters for finding the best cutting path. The two
parameters are respectively the range of the projection angle ( ) and the value of the angular
step. In the following, these two parameters are studied for evaluating their impact of the digit
string recognition.
4.5.4.1. Influence of the range of the projection angle
Selecting the range of the projection angle plays an important role for finding the best cutting
path. Figure 4.14 depicts an illustrative example showing the influence of selecting the range of
the projection angle when segmenting two connected digits using SWRT. As we can see, when
the range of the projection angle is reduced, the cut is well performed.

82
The recognition rates for the primary and secondary segmentations are reported in Figure 4.15
according to the range of the projection angle when the value of the angular step is fixed to
10°. Table 4.3 reports the average rates for all digit string lengths derived from different ranges
of the projection angle. The last column reports the overall recognition rate for all string lengths
provided by the proposed system with respect to the range of the projection angle (°) while the
value of angular step is set to 10°.
Figure 4.14. Impact of selecting the range of the projection angle for detecting the cutting path from
[1, 179] to [80, 100].
(a)
70
72
74
76
78
80
82
84
86
88
90
92
94
96
98
100
1
2
3
4
5
6
7
8
9
Rec.
R
at
e
(%)
Index of the range of the projection angle
Primary segmentation
Secondary segmentation
[1, 179]
[10, 170]
[20, 160]
[30, 150]
[40, 140]
[50, 130]
[60, 120]
[70, 110]
[80, 100]

83
(b)
(c)
(d)
70
72
74
76
78
80
82
84
86
88
90
92
94
96
98
100
1
2
3
4
5
6
7
8
9
Rec.
R
at
e
(%)
Index of the range of the projection angle
Primary segmentation
Secondary segmentation
70
72
74
76
78
80
82
84
86
88
90
92
94
96
98
100
1
2
3
4
5
6
7
8
9
Rec.
R
at
e
(%)
Index of the range of the projection angle
Primary segmentation
Secondary segmentation
70
72
74
76
78
80
82
84
86
88
90
92
94
96
98
100
1
2
3
4
5
6
7
8
9
Rec.
R
at
e
(%)
Index of the range of the projection angle
Primary segmentation
Secondary segmentation

84
(e)
(f)
Range
[1, 179]
[10, 170]
[20, 160]
[30, 150]
[40, 140]
[50, 130]
[60, 120]
[70, 110]
[80, 100]
Index
1
2
3
4
5
6
7
8
9
Figure 4.15. Influence of selecting the range of the projection angle (°) for different digit string
lengths: (a) 2-Digit string length (b) 3-Digit string length (c) 4-Digit string length (d) 5-Digit string
length (e) 6-Digit string length (f) 10-Digit string length.
70
72
74
76
78
80
82
84
86
88
90
92
94
96
98
100
1
2
3
4
5
6
7
8
9
Rec.
R
at
e
(%)
Index of the range of the projection angle
Primary segmentation
Secondary segmentation
70
72
74
76
78
80
82
84
86
88
90
92
94
96
98
100
1
2
3
4
5
6
7
8
9
Rec.
R
at
e
(%)
Index of the range of the projection angle
Primary segmentation
Secondary segmentation

85
Rec. Rate (%)
Range of the
projection angle (°)
Primary
Secondary Overall
[1,179]
99.05
86.98
93.99
[10,170]
99.05
87.74
94.29
[20,160]
99.05
86.80
93.92
[30,150]
99.05
87.38
94.15
[40,140]
99.05
87.74
94.29
[50,130]
99.05
86.98
93.99
[60,120]
99.05
88.79
94.71
[70,110]
99.05
90.90
95.56
[80,100]
99.05
94.18
96.87
Table 4.3. Average rates obtained for all digit string lengths derived from different ranges of the
projection angle (°)
As we can see, the selection of the projection angle affects the performance of the system
more specifically for the secondary segmentation whatever the length of the digit string.
Moreover, the recognition rates for each digit string length and the average rates for all digit
string lengths are improved when the range of the projection angle is reduced.
The best performance is obtained when the range of the projection angle is fixed in the interval
[80, 100]. For the primary segmentation, the selected range does not affect the performance of
the system since SWRT is performed only on the secondary segmentation. In contrast, this
selected range allows improving the recognition rate till 94.18% for the secondary
segmentation providing an overall recognition rate of 96.87%. The good performance obtained
for this range of angles is fairly predictable since the writing generated by a writer is generally
inclined around at an angle about 90°. Hence, this range of angles for SWRT can be considered
as an optimum for all digit strings.
4.5.4.2. Adjustment of the angular step
The adjustment of the angular step for SWRT is also considered as a crucial parameter for
achieving a robust system. Its value is a trade-off between the computation time and the
recognition rate. In order to study its influence, the proposed system is evaluated for different
values of the angular step ranged from 1° to 10°. The range of the projection angle is fixed in
[80,100] according to the best performance obtained previously. Figure. 4.16 depicts the
recognition rate versus the angular step for different digit string lengths. Table 4.4 reports the
average rates for each angular step computed for all digit string lengths according to the
primary and secondary segmentations, respectively. The last column reports the overall

86
recognition rate for all string lengths provided by the proposed system with respect to the
angular step.
(a)
(b)
(c)
90
91
92
93
94
95
96
97
98
99
100
1
2
3
4
5
6
7
8
9
10
Rec.
R
at
e
(%)
Angular step (°)
Primary segmentation
Secondary segmentation
90
91
92
93
94
95
96
97
98
99
100
1
2
3
4
5
6
7
8
9
10
Rec.
R
at
e
(%)
Angular step (°)
Primary segmentation
Secondary segmentation
90
91
92
93
94
95
96
97
98
99
100
1
2
3
4
5
6
7
8
9
10
Rec.
R
at
e
(%)
Angular step (°)
Primary segmentation
Secondary segmentation

87
(d)
(e)
(f)
Figure 4.16. Influence of adjusting the angular step (°) for different digit string lengths: (a) 2-Digit
string length (b) 3-Digit string length (c) 4-Digit string length (d) 5-Digit string length (e) 6-Digit string
length (f) 10-Digit string length.
90
91
92
93
94
95
96
97
98
99
100
1
2
3
4
5
6
7
8
9
10
Rec.
R
at
e
(%)
Angular step (°)
Primary segmentation
Secondary segmentation
90
91
92
93
94
95
96
97
98
99
100
1
2
3
4
5
6
7
8
9
10
Rec.
R
at
e
(%)
Angular step (°)
Primary segmentation
Secondary segmentation
90
91
92
93
94
95
96
97
98
99
100
1
2
3
4
5
6
7
8
9
10
Rec.
R
at
e
(%)
Angular step (°)
Primary segmentation
Secondary segmentation

88
Rec.Rate (%)
Angular step (°)
Primary
Secondary
Overall
1
99.05
93.12
96.45
2
99.05
94.29
96.91
3
99.05
94.29
96.91
4
99.05
94.24
96.89
5
99.05
94.18
96.87
6
99.05
93.83
96.73
7
99.05
94.24
96.89
8
99.05
94.24
96.89
9
99.05
94.24
96.89
10
99.05
94.18
96.87
Table 4.4. Recognition rates according to the angular step for each string length.
As we can see, adjusting the angular step from 1 to 10 (°) does not affect considerably the
performance of the proposed system whatever the length of the digit string (Figure 4.16).
However, the best improvement is obtained when the angular step is fixed to 2 or 3 providing
an average recognition rate 94.29% for the secondary segmentation and an overall recognition
rate of 96.91% (Table 4.4).
In order to show the impact of selecting the appropriate angular step, Figure 4.17 illustrates the
correct and incorrect segmentation when the angular step is fixed to 3 and 6°, respectively.
When the angular step is correctly adjusted (Figure 4.17a), the Upper BP and Lower BP are well
detected into the SWRT and therefore the separation of the two adjacent digits is well
performed. In contrast, when the angular step is incorrectly adjusted (Figure 4.17b), the Upper
BP and Lower BP are not well detected, which leads to an incorrect segmentation since the
SWRT is fixed around IP on the wrong orientation angle.
(a) (b)
Figure 4.17. Impact of selecting the angular step for detecting the cutting path: (a) Correct
segmentation when the angular step is fixed to 3°. (b) Incorrect segmentation when the angular step
is fixed to 6°.
Upper BP
IP
Lower BP
W
SWRT

89
4.5.5. Comparative analysis
In order to compare our system with others systems, we also used the same string lengths
images database (NIST NSTRING SD19) for our experiments. This database is used in Refs.
[85, 59,88-89], so we recall these systems which will compare with our system in terms of
correct segmentation and recognition. Table 4.5 shows that the results of our method compare
to other methods in the literature.
String length
#Strings Oliveira et al.
method [59]
Britto et al.
[85]
Oliveira and
Sabourin [88]
Sadri et al.
[89]
Our
approach
2-Digit
2370
96.88
94.80
97.67
95.05
99.01
3-Digit
2385
95.38
91.60
96.26
91.43
97.30
4-Digit
2345
93.38
91.30
94.28
91.07
96.56
5-Digit
2316
92.40
88.30
94.00
88.05
95.95
6-Digit
2169
93.12
89.10
93.80
88.69
96.65
10-Digit
1217
90.24
86.90
91.38
86.13
96.01
Overall
-
93.57
90.33
94.57
90.07
96.91
Table 4.5. Comparative analysis of various segmentation systems for the unknown-length string
performed on NIST NSTRING SD19
In this table, we compared the results of our proposed system by considering unknown-length
digit strings with the best results of similar systems. Moreover, all parameters are fixed. The
proposed system achieved overall average rates of 96.91% for all string length. It attempted to
find the best segmentation on digit strings database. The correct recognition rate for string
lengths of 2,3,4,5,6 and 10 was 99.01%, 97.30%, 96.56%, 95.95%, 96.65%, and 96.01%
respectively. Relative results clearly show how our proposed system outperforms the state-of-
the-art results. The benefit of this technique is that it can properly segments the spaced,
sloped, overlapped and connected digits in digit string as well as unknown-length digit string.
However, apply successively CA, SCA and DRV for analyzing and verifying the SC allow
segmenting the digit string without any information about the length of the string.
The DRV on isolated digits is designed in order to recognize single digits one at a time, so they
do not use contextual knowledge in their decisions such as the comparison of connected or
overlapped components, or comparing the relative sizes (length digit strings) or the relative
positions of the various components.
Some examples of correctly and incorrectly recognized strings of our system on handwritten
digit strings from NSTRING SD19 indicate three possible results as output: correct segmentation

90
and correct recognition, wrong segmentation and/or wrong recognition. Table.4.6 presents
some examples of correctly and incorrectly recognized strings.
Digit string
Class labels
Segmented
digits
Assigned
classes
Correct
segmentation and
recognition
03283
03283
75434
75434
45537
45537
5601
5601
Incorrect
segmentation
9898
90898
84097
84297
Incorrect recognition
13211
13210
72937
12931
Table 4.6. Examples of the correct and incorrect segmentation-recognition produced by the proposed
system from the NSTRING SD19 Database.
4.5.6. Computation cost of the proposed system
The proposed digit string recognition system allows resolving different segmentation problems
in order to select the most appropriate one. Since this system can be deployed in real
environment, it is interesting to compute the cost required for treating all situations. Hence, the
computation cost is measured by considering three situations:
Spaced digit: the total Number of Segmentation Hypotheses using HVP (N
) is the same as
the number of the spaced digits.
Overlapped digit: Let N
be the Number of Grouped Broken Component provided by CA when
the broken component exist and N
the total Number of Isolated Digit provided by CA, then,
the total Number of Segmentation Hypotheses from CA (N ) is N
= N
+ N
Connected digit: Let N
be the Number of GC and N
the total Number of SC, then the total
Number of Segmentation Hypotheses from SWRT (N
) is calculated as N
= N
+ N .
Finally, the Total Number of Segmentation Hypotheses (N ) is the sum of all numbers of
segmentation hypotheses taking the following form:
N
= N (N
+ N
+ N
) (4.7)

91
Where N
is the number of digits contained into a string. By substituting N
, N
and N
,
Eq. 4.7 becomes:
N
= N . (N
+ N
+ N
+ N
+ N ) (4.8)
Eq. 4.8 clearly indicates that the computation cost depends of the digit string length and the
complexity of the connected digits for finding the best cutting path.
44.6. Summary
The goal of this paper was to present a full explicit segmentation system for handwritten digit
strings based on a combination of several segmentation methods depending on the
configuration link between digits. Hence, three segmentation methods have been combined,
which are HVP, CA and SWRT used conjointly with SCA and DRV for verification and
recognition.
The benefit of the proposed system is its ability to properly segment the spaced, sloped,
overlapped and connected digits without any information about the length of the string when
performing successively CA, SCA and DRV for analyzing and verifying the SC.
The obtained performance of the proposed segmentation-recognition system of digit strings
depends on three different factors: the accuracy of SVM-OAA and robustness of the DRV, the
used segmentation methods and, finally, the combination strategy with some heuristic rules.
In some cases, the unknown-length of the digit strings makes the task for classifiers more
difficult to recognize many outliers leading to treat, for example, digit strings as valid digits
[89]. However, by combining HVP, CA and SWRT, the proposed system allows reducing the
confusion of the segmentation-recognition.

92
CC
ONCLUSION
The main focus of this thesis was the development of a Automatic Handwritten Digit String
Recognition (AHDSR), where the main efforts were focused towards the problems of
segmenting adjacent digits inherent to the variability of the digits, broken digits, overlapping
digits, connected digits, unknown length of the string, under-segmentation and over-
segmentation. The improvement of the performance of such a system is not a trivial problem
at all.
We have proposed a SVM-based segmentation-verification system for segmenting two
connected handwritten digits using the oriented sliding window and SVM classifiers in order to
find the best cut for isolating two adjacent digits. Hence, the design of the proposed
segmentation-verification system can be conducted into four main steps: segmentation of
connected digits by searching the interconnection points using the oriented sliding window,
generation of features on isolated digits, recognition of the segmented digits using the SVM-
OAA implementation and verification for accepting or rejecting the digit recognition. More
precisely, the segmentation of connected digits uses two complementary structural features
(contour and skeletal points) for detecting the interconnection and base points and finding the
optimal path. Experimental results showed that the performance of the system depends on the
selection of the appropriate orientation angle. Indeed, obtained results clearly show that the
recognition rate increases when selecting the appropriate angle. Therefore, choosing an optimal
angle is a crucial step for cutting optimally two connected digits. When comparing the proposed
approach to the state of the art, it can note that it constitutes a tradeoff between the correct
segmentation and the number of the segmentation cuts. However, the proposed system does
not provide satisfactory results in some cases caused by two main problems:
The first problem is caused by the robustness of the classifier. Indeed, although, the
segmentation is well performed, the classifier does not allow recognizing the segmented
digit due to its particular shape. To resolve this problem, it is possible to add this particular
digit into the training set for improving the performance of SVM classifier.
The second problem is caused by "no obvious segmentation point". In this case, the
assumption 3 (No IP) is often applied for segmenting connected digits. This means that the
segmentation path in the middle does not always provide acceptable results.
In order to extend the previous system to recognize the digit string for the unknown-length, a
new design is proposed on the combination of several explicit segmentation methods according
on the configuration link between digits. Three segmentation methods are combined based on
the histogram vertical projection, contour analysis and the Sliding Window Radon Transform
(SWRT) for recognizing unknown-length handwritten digit strings with the help of verification

93
strategy. The main idea of the proposed segmentation algorithm is to introduce a super cutting
path in order to avoid the over-segmentation and the under-segmentation. The verification
strategy uses the recognition module for evaluating each segmented component according to
the configuration link in order to provide optimal decisions in difficult situations.
Generally, the first results obtained are encouraging since the proposed system allows
managing all the possible ways of segmentation. This combination uses few rules and has the
advantage of providing a correct segmentation without using the contextual knowledge.
Future Works
Although this works proposes various techniques for improving the performance of the
segmentation, some segmentation problems remain difficult to resolve. Hence, some future
works can be outlined such as:
Include the investigation of other features as well as a combination of different classifiers to
further improve the recognition rates. A feature selection mechanism to identify the most
appropriate subset of features for this problem would also be interesting to explore.
Reduce the number of segmentation hypotheses in order to reduce the complexity of the
system and therefore becomes more faster.
Explore the way to use a filter for eliminating the unnecessary segmentation hypothesis in
order to reduce the computation time.

94
RR
EFERENCES
[1]. G. Dimauro, S. Impedovo, G. Pirlo and A. Salzo, Automatic Bankcheck processing. A New Engineered
System, International Journal of Pattern Recognition and Artificial Intelligence 11(4) (1997) 467-504.
[2]. M. Cheriet, Y. Al-Ohali, N. E. Ayat and C. Y. Suen , Arabic Cheque Processing System: Issues and
Future Trend. Advances in Pattern Recognition, ed. B.B. Chaudhuri (Springer Verlag, 2007), pp. 213­
232.
[3]. M. Suwa, Segmentation of connected handwritten numerals by graph representation, in Proc. Eighth
International Conference on Document Analysis and Recognition (ICDAR'05), Seoul, 2005, vol. 2, pp.
750­754.
[4]. J. R. Ward and T. Kuklinski, A model for variability effects in handprinted with implication for the
design of handwritten character recognition system, IEEE Trans. Man Cybernetics 18(1988), pp.438­
451.
[5]. M. Shridhar and A. Badreldin, Recognition of Isolated and Simply Connected Handwritten Numerals,
Pattern Recognition 19 (1) (1986) 1­12.
[6]. M. Shridhar and A.Badreldin, Context-directed segmentation algorithm for handwritten numeral
strings, Image and Vision Computing 5 (1) (1987) 3­9.
[7]. B. K. Jang and R. T. Chin, One-pass parallel thinning: Analysis, properties, and quantitative
evaluation, IEEE Transactions on Pattern Analysis and Machine Intelligence 14 (11) (1992) 1129­
1140.
[8]. E. Vellasques, L. S. Oliveira, Jr. A. S. Britto, A. L. Koerich and R. Sabourin, Filtering segmentation
cuts for digit string recognition, Pattern Recognition 41(10) (2008) 3044­3053.
[9]. F. C. Ribas, L. S. Oliveira, Jr. A. S. Britto and R. Sabourin, Handwritten digit segmentation: a
comparative study, International Journal on Document Analysis and Recognition 16 (2) (2012) 127­
137.
[10]. E. Lecolinet and O. Barret. Cursive word recognition: methods and survey. In S. Empedovo Editor,
Fundamentals in Handwriting Recognition, volume 24 of NATO_Advanced Institute, Series F
Springer-Verlag, pp.12-166.
[11]. P.M. Lallican, C. Viard-Gaudin, and S. Knerr, From Off-Line to On-Line Handwriting Recognition, Proc.
Int'l Workshop Frontiers in Handwriting Recognition, Amsterdam, 2000, pp. 303-312.
[12]. J. Kanai and A. D. Bagdanov, Projection profile based skew estimation algorithm for JPIG compressed
images, Int. J. Document Anal. Recognit., 1 (1) (1998) 43­51.
[13]. B. Yu and A. K. Jain, A robust and fast skew detection algorithm for generic documents, Pattern
Recognition 29 (10) (1996) 1599­ 1630.
[14]. A. Hashizume, P. S. Yeh, and A. Rosenfeld, A method of detecting the orientation of aligned
components, Pattern Recognit. Lett. 4 (1986) 125­132.
[15]. R. M. Bozinovic and S. N. Srihari, Off-line cursive script word recognition, IEEE Trans. Pattern Anal.
Machine Intell. 11 (1989) 68­83.
[16]. D. Guillevic and C. Y. Suen, Cursive script recognition: A sentence level recognition scheme, in Proc.
Int. Workshop Frontiers in Handwriting Recognit., 1994, pp. 216­223.
[17]. Kornai, K. M. Mohiuddin, and S. D. Connell, "Recognition of cursive writing on personal checks," in
Proc. of the 5th International Workshop on Frontiers in Handwriting Recognition, Essex, U.K., 1996,
pp.373­378.
[18]. M. Cheriet, N. Kharma, C.L. Liu and C.Y. Suen, Character Recognition Systems, a Textbook for
Students and Practitioners, Edition John Wiley & Sons Inc. p.326. 2004.

95
[19]. N. W. Strathy. A method for segmentation of touching handwritten numerals.Master's thesis,
Concordia University, Montreal - Canada, Setember 1993.
[20]. G.Nemeth and K. Palagyi, ,Topology Preserving Parallel Thinning Algorithm, International Journal of
Imaging System and Technology, 21 (1) (2011) 37-44.
[21]. P.A. Devijver and J. Kittler,Pattern recognition, a statistical approach, Prentice Hall, London, pp. 480,
1982.
[22]. K. K.Kim, J. H. Kim, and C. Y. Suen, Segmentation-based recognition of handwritten touching pairs
of digits using structural features, Pattern Recognition 23 (1) (2002) 13-24.
[23]. D.Y.H. Yan, Separation of touching handwritten multi-numeral strings based on morphological
structural features, Pattern Recognition 34 (3) (2001) 587-599.
[24]. J. Mantas, Methodologies in pattern recognition and image analysis - a brief survey, Pattern
Recognition 20 (1) (1987) 1­6.
[25]. V. N. Vapnik. The Nature of Statistical Learning Theory. Springer-Verlag, London, UK, 1995.
[26]. R. Payam, T. Lei and L. Huan, Cross Validation. In Encyclopedia of Database Systems, ed. Springer,
(2009), pp. 532­538.
[27]. J. Weston and C. Watkins, Multi-class support vector machines for multi-class pattern recognition, in
Proc. European Symposium on Artificial Neural Networks. Bruges, Belgium, 1999, pp. 219­224.
[28]. C. Hsu and C. Lin, A comparison of methods for multi-class support vector machines, IEEE Trans. on
Neural Networks 13 (2) (2002) 415­425.
[29]. Y. Guermeur, A. Elisee and H. PaugamMoisy, A New multiclass SVM based on a uniform convergence
result, in Proc. of International Joint Conference on Neural Networks. Como, Italy, 2000, vol.4, pp.
183-188.
[30]. V. N. Vapnik, An Overview of Statistical Learning Theory, IEEE Trans. on Neural Networks. 10 (5)
(1999) 988­999.
[31]. J. C. Platt, N. Cristianini, and J. Shawe-Taylor. Large margin DAGs for multiclass classification. In S.
A. Solla, T. K. Leen, and K.-R. Müller, editors, Advances in Neural Information Processing Systems,
The MIT Press, Cambridge, MA, 2000, pp. 547­553.
[32]. Y. LeCun , L. Jackel , L. Bottou , A. Brunot , C. Cortes , J. Denker , H. Drucker , I. Guyon , U. Müller ,
E. Säckinger , P. Simard and V. Vapnik, Comparison of learning algorithms for handwritten digit
recognition, in: F. Fogelman-Soulié, P.Gallinari (Eds.), Proceedings of the International Conference on
Artificial Neural Networks, Nanterre, France, pp. 53­60, 1995.
[33]. L. Heutte, T. Paquet, J.V. Moreau, Y. Lecourtier and C. Olivier, A structural/statistical feature based
vector for handwritten character recognition, Pattern Recognition Lett., vol. 19, n°7, pp. 629­641,
1998.
[34]. J. Cai, and Z.Q. Liu, Integration of structural and statistical information for unconstrained
handwritten numeral recognition, IEEE Trans. on Pattern Analysis and Machine Intelligence, 21 (3)
(1999) 263-270,.
[35]. J.X. Dong, A. Krzyzak and C.Y. Suen, A multi-net learning framework for pattern recognition,
Proceedings of the Sixth International Conference on Document Analysis and Recognition, Seattle,
pp. 328­332, 2001.
[36]. L. N. Teow and K. F. Loe, Robust vision-based features and classification schemes for off-line
handwritten digit recognition, Pattern Recognition 35 (11) (2002) 2355­ 2364.
[37]. S. Belongie, J. Malik and J. Puzicha, Shape matching and object recognition using shape contexts,
IEEE Trans. Pattern Anal. Mach. Intell. 24 (4) (2002) 509­522.
[38]. F. Lauer, C. Y. Suen and G. Bloch, A trainable feature extractor for handwritten digit recognition,
Pattern Recognition 40 (6) (2007) 1816­1824.
[39]. C. L. Liu, K. Nakashima, H. Sako, and H. Fujisawa, Handwritten Digit Recognition: Benchmarking of
State-of-the-Art Techniques, Pattern Recognition 36 (10) (2003) 2271-2285.
[40]. M. Diem, S. Fiel, A. Garz, M. Keglevic, F. Kleber and R. Sablatnig, ICDAR 2013 Competition on
Handwritten Digit Recognition (HDRC 2013), In Proc. of the 12th Int. Conference on Document
Analysis and Recognition (ICDAR) 2013, pp. 1454-1459, 2013.

96
[41]. A. Britto, R. Sabourin, F. Bortolozzi, C.Y.Suen. « Complementary features combined in an HMM-based
system to recognize handwritten digits ». In 12th International Conference on Image Analysis and
Processing(ICIAP), Mantova, Italy, pp. 670-675, 2003.
[42]. Y. Chen and J.F. Wang, Segmentation of Single- or Multiple-Touching Handwritten Numeral String
Using Background and Foreground Analysis, IEEE Transactions on Pattern Analysis Machine
Intelligence 22 (11) (2000) 1304-1317.
[43]. G. Aubertet, and P. Kornprobst, Traitement des images numériques, In J. Akoka and I. Comyn-
Wattiau, editors, Encyclopédie de l'informatique et des systèmes d'information, Vuibert, vol. 18, n°6,
pp. 861­879,2006.
[44]. H. Nishida, Curve description based on directional features and quasi-convexity/concavity. Pattern
Recognition 28 (7) (1995) 1045-1051.
[45]. M.K. Hu, Visual problem recognition by moment invariant, IRE Trans. Inform. Theory, IT-8 (1962)
179-187.
[46]. M.K. Hu, Pattern recognition by moment invariants, proc. IRE, vol. 49, p.1428, 1961.
[47]. A.V.N. Manjunath, K. G. Hemantha and S. Noushath, Robust Unconstrained Handwritten Digit
Recognition using Radon Transform, in International Conference on Signal Processing,
Communications and Networking (ICSCN), 22-24 Feb, 2007. pp. 626-629.
[48]. O. R. Terrades and E. Valveny, Radon Transform for Linear Symbol Representation , Proc. of the
Seventh International Conference on Document Analysis and Recognition (ICDAR 2003) ,vol.1,
Edinburgh, Scotland, 3-6 August, pp. 195-199, 2003.
[49]. E.C. Kintner, On the mathematical properties of the Zernike polynomials, Opt. Acta. 23 (8) (1976)
679­680.
[50]. L.S. Oliveira, Automatic recognition of handwritten numerical strings, Thesis of PhD, Ecole de
technologie superieure, University of Quebec. Canada, pp. 56-58,184, 2003.
[51]. E.J. Candès, and D.L. Donoho, Ridgelets: A Key to Higher-dimensional Intermittency, Philosophical
Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 357 (1760)
(1999) 2495­2509.
[52]. G. Y. Chen, , T. D. Bui, and A. Krzyzak, "Rotation invariant feature extraction using Ridgelet and
Fourier transforms," Journal of Pattern Analysis and Application, vol. 9, pp.83-93, 2006.
[53]. S. Arivazhagan, S.S. Priyadharshini and J.R. Sekar, Iris recognition using Ridgelet transform,
International Conference on Recent Advancements in Electrical, Electronics and Control Engineering
(ICONRAEeCE), pp. 286 ­ 290 ,2011.
[54]. J. Favata and G. Srikantan, A Multiple Feature/Resolution Approach To Handprinted Digit and
Character Recognition, International journal of imaging systems and technology 7 (4) (1996) 304­
311.
[55]. A. Gattal and Y. Chibani, Combination of features generation methods for recognition of normalized
and not normalized isolated handwritten digits, Business Intelligence and Mobile Technology
Research: An Information Systems Engineering Perspective, ed. (Laouar and Eom), Cambridge
Scholars Publishing, (2013), pp. 118­132.
[56]. P. J. Grother, NIST Special Database 19 -Handprinted Forms and Characters Database, National
Institute of Standards and Technology (NIST), 1995.
[57]. A. Gattal, Y. Chibani, C. Djeddi and I. Siddiqi, Improving Isolated Digit Recognition using a
Combination of Multiple Features, in Proc. of 14th International Conference on Frontiers in
Handwriting Recognition (ICFHR-2014), Crete Island, Greece ,2014, pp. 446­451.
[58]. J. Kittler and J. Illingworth, Minimum Error Thresholding, Pattern Recognition 19 (1) (1986) 41-47.
[59]. L. S. Oliveira, R. Sabourin, F. Bortolozzi and C. Y. Suen, Automatic Recognition of Handwritten
Numerical Strings: A Recognition and Verification Strategy, IEEE Transactions on Pattern Analysis
and Machine Intelligence ­ PAMI 24 (11) (2002) 1438­1454.
[60]. X. Wang, V. Govindaraju and S. Srihari, Holistic recognition of touching digits, in Proc. Sixth
International Workshop on Frontiers in Handwriting Recognition. Daejeon, 1998, pp. 295­303.

97
[61]. G. Congedo, G. Dimauro, S. Impedovo and G. Pirlo, Segmentation of Numeric Strings, in Proc. of
Third Int. Conf. on Document Analysis and Recognition. Montreal, Canada, 1995, vol. 2, pp. 1038­
1041.
[62]. H. Fujisawa, Y. Nakano and K. Kurino, Segmentation Methods for Character Recognition: From
Segmentation to Document Structure Analysis, Proceedings of the IEEE 80 (7) (1992) 1079­1091.
[63]. Z. Shi and V. Govindaraju, Segmentation and recognition of connected handwritten numeral strings,
Pattern Recognition 30 (9) (1997) 1501­1504.
[64]. A. Elnagar and R. Alhajj, Segmentation of connected handwritten numeral strings, Pattern
Recognition 36 (3) (2003) 625­634.
[65]. U. Pal, A. Belaid and C. Choisy, Touching numeral segmentation using water reservoir concept,
Pattern Recognition Letters 24 (2003) 261­272.
[66]. L. S. Oliveira, E. Lethelier, F. Bortolozzi, R. Sabourin, A new approach to segment handwritten digits,
in Proc. 7th International Workshop on Frontiers in Handwriting Recognition. Amsterdam,
Netherlands, 2000, pp. 577­582.
[67]. R. Fenrich and S. Krishnamoorthy, Segmenting diverse quality handwritten digit strings in near real-
time, in Proc. of 5th USPS Advanced Technology Conference. Washington, D.C., 1990, pp. 523­537.
[68]. K.K. Kyung, K. J. Ho and Y. S. Ching, Segmentation-based recognition of handwritten touching pairs
of digits using structural features, Pattern Recogn. Lett. 23 (1­3) (2002) 13­24.
[69]. Y. Lei, C. S. Liu, X. Q. Ding and Q. Fu, A recognition based system for segmentation of touching
handwritten numeral strings. Proc. of 9th International Workshop on Frontiers in Handwriting
Recognition (IWFHR-09), Tokyo, Japan, 2004, pp.294-299.
[70]. R. Ma, Y. Zhao, Y. Xia and Y. Yan, A Touching Pattern-oriented Strategy for Handwritten Digits
Segmentation, Proc. of the Intl. Conf. on Computational Intelligence and Security (CIS), Suzhou,
China, vol.1, 2008, pp. 174­179.
[71]. Y.J. Wang, X.B. Liu and Y.D. Jia, Statistical Modeling and Learning for Recognition-Based
Handwritten Numeral String Segmentation, Proc. of 10th International Conference on Document
Analysis and Recognition (ICDAR-10), Barcelona, Spain, 2009, pp. 421­425.
[72]. B. L. Everton and A. B. M. Carlos, Segmentation of connected handwritten digits using Self-
Organizing Maps, Expert systems with applications 40 (2013) 5867-5877.
[73]. A. Gattal and Y. Chibani, Segmentation Strategy of Handwritten Connected Digits (SSHCD), in Proc.
of 16th International Conference on Image Analysis and Processing (ICIAP'11) Ravenna, Italy, 2011,
pp. 248­254.
[74]. A. Gattal and Y. Chibani, Segmentation and recognition strategy of handwritten connected digits
based on the oriented sliding windows, in Proc. of International Conference on Frontiers in
Handwriting Recognition. Bari, Italy, 2012, pp. 297-301.
[75]. K. M. Hussein, A. Agarwal, A. Gupta and P. S. P. Wang, A Knowledge-based algorithm for enhanced
recognition of handwritten courtesy amounts, Pattern Recognition 32 (1999) 305­316.
[76]. A. S. Britto, R. Sabourin, F. Bortolozzi and C. Y. Suen, Foreground and Background Information in an
HMM-based Method for Recognition of Isolated Characters and Numeral Strings, in Proc.
International Workshop on Frontiers in Handwriting Recognition. Tokyo, Japan, 2004, pp. 371­376.
[77]. A. Ko, R. Sabourin and Jr. A. Britto, Ensemble of HMM classifiers based on the Clustering Validity
Index for a Handwritten Numeral Recognizer, Pattern Analysis and Applications 12 (01) (2009) 21­
35.
[78]. A. Gattal and Y. Chibani, SVM-Based Segmentation-Verification of Handwritten Connected Digits
Using the Oriented Sliding Window, International Journal of Computational Intelligence and
Applications (IJCIA), 14 (1) (2015) 1­17
[79]. L. S. Oliveira, N. Benahmed, R. Sabourin, F. Bortolozzi and C. Y. Suen, Feature Subset Selection
Using Genetic Algorithm for Handwritten Digit Recognition, in Proc. XIV Brazilian Symposium on
Computer Graphics and Image Processing. Florianopolis, Brazil, 2001, pp. 362-369.
[80]. B. Al-Badr and S. A. Mahmoud, Survey and bibliography of Arabic optical text recognition, Signal
Processing. 41 (1) (1995) 49­77.

98
[81]. A. S. Britto, R. Sabourin, F. Bortolozzi and C. Y. Suen, The recognition of handwritten numeral strings
using a two-stage HMM-based method, International Journal on Document Analysis and Recognition
5 (2003) 102­117.
[82]. R. M. Suresh, S. Arumugam, Fuzzy technique based recognition of handwritten characters, Image
and Vision Computing 25 (2) (2007) 230­239.
[83]. V.K. Madasu, M.H.M. Yusof, M. Hanmandlu, K. Kubik, Automatic extraction of signatures from bank
cheques and other documents, Proc. of the Seventh International Conference on Digital Image
Computing: Techniques and Applications (DICTA-07), Sydney, Australia, 2003, pp.591­600.
[84]. L. Lam, C. Y. Suen, Structural classification and relaxation matching of totally unconstrained
handwritten zip-code numbers, Pattern Recognition, 21 (1) (1988) 19-31.
[85]. A.S. Britto, R. Sabourine, F. Bortolozzi, C.Y. Suen, Recognition of handwritten numeral strings using a
two-stage HMM-based method, Int. J. Doc. Anal. Recognition 5 (2­3) (2003) 102­117.
[86]. V. Alessandro, J. Luettin, A new normalization technique for cursive handwritten words. Pattern
Recognition Letters. 22 (9) (2001) 1043-1050.
[87]. Z. Lu, Z. Chi, W.C. Siu, P. Shi, A background-thinning-based approach for separating and recognizing
connected handwritten digit strings, Pattern Recognition, 32 (6) (1999) 921­933.
[88]. L.S. Oliveira, R. Sabourin, Support vector machines for handwritten numerical string recognition,
Proc. of the International Workshop on Frontiers in Handwriting Recognition (IWFHR-09), Tokyo,
Japan, 2004, pp. 39­44.
[89]. J. Sadri, C.Y. Suen, T.D. Bui, A genetic framework using contextual knowledge for segmentation and
recognition of handwritten numeral strings. Pattern Recognition, 40 (3) (2007) 898­919.
[90]. T. Kanungo, R. M. Haralick, Character recognition using mathematical morphology. Proc. of USPS
Fourth Advanced Technology Conference, Washington, D.C., 1990, Vol. 2, pp. 973-986.
[91]. T.M. Ha, M. Zimmermann, H. Bunke, Off-line handwritten numeral string recognition by combining
segmentation-based and segmentation-free methods. Pattern Recognition, 31 (3) (1998) 257­272.

99
AA
BOUT THE AUTHOR
B
IOGRAPHY
Abdeljalil Gattal was born in Algeria. He received his BS degree in
Computer Science from University of Skikda (Algeria) in 2004, and
MS degree in Computer Science "Information and knowledge
systems" from Abbes Laghrour University of Khenchela (Algeria) in
2009. Currently, He is a PhD student at the Ecole nationale
Supérieure d'Informatique (ESI-Algeria) in Computer Science. He is
working as an Assistant Professor at the Department of Computer
and Mathematics Science in University of Tebessa (Algeria) since
2009. He participated in the organization of several national and
international conferences, which were held at the University of
Tebessa. He collaborated, as member in several research projects
CNEPRU and PNR. He supervised many Master and License
students. He has published a number of papers. His research interests include Image Analysis,
Pattern Recognition and Recognition of handwriting. Currently, he is member of
Laboratoire de Mathématiques, d'Informatique et des Systèmes (LAMIS) of the University of
Tebessa.
SCIENTIFIC CONTRIBUTIONS
1. Journal Papers
Abdeljalil Gattal and Youcef Chibani," SVM-Based Segmentation-Verification of Handwritten
Connected Digits Using the Oriented Sliding Window", International Journal of
Computational Intelligence and Applications (IJCIA), Vol. 14, N° 01, March 2015.
Information about the journal IJCIA
ISSN: 1469-0268
Frequency of publication: 4 numbers per year.
Process reviewing: peer reviewed.
Published by: Brijesh Verma, in all World Scientific and Imperial College Press journals.
Web Site: http://www.worldscientific.com/worldscinet/ijcia
Indexed in: Compendex, CompuScience, DBLP Computer Science Bibliography, Ebsco
Discovery Service, ExLibris Primo Central , Google Scholar ,OCLC WorldCat® , Proquest
Summon , Scopus.

100
22. Book Chapters:
Abdeljalil Gattal, Youcef Chibani, "Combination of features generation methods for recognition of
normalized and not normalized isolated handwritten digits", Business Intelligence and Mobile
Technology Research: An Information Systems Engineering Perspective, ed (Laouar and Eom).
published by Cambridge Scholars Publishing, pp. 118-132.2013
3. Papers published in Lecture Notes
A.GATTAL, Y.CHIBANI, "Segmentation Strategy of Handwritten Connected Digits (SSHCD)", 16th
International Conference on Image Analysis and Processing, ICIAP 11, 14-16 September 2011,
Ravenna, Italy. Proceedings, Part II. LNCS 6979 Springer 2011, ISBN 978-3-642-24087-4.
4. Conference Papers
1. Abdeldjalil GATTAL, Youcef CHIBANI, Hassiba NEMOUR, Mohamed CHERIET: « Combinaison de
méthodes de segmentation automatique des chiffres manuscrits pour la reconnaissance du
montant numérique des cheques bancaires algériens », Journées sur la Gestion Electronique
de Documents et Réseaux Informatiques 2009, Annaba, 20-21 Mai 2009.
2. A. Gattal, Y. Chibani, H. Nemmour et M. Cheriet : « combinaison de méthodes de segmentation
automatique des chiffres manuscrits pour la reconnaissance du montant numérique des chèques
bancaires algériens », The 6th International School on Signal The 6th International Summer
School on Signal Processing & its Applications, ISSSPA 09, 4-8 October 2009, Oran, Algeria.
3. A.Gattal, Y. Chibani, H. Nemmour, « Combinaison de méthodes de segmentation explicite pour la
reconnaissance automatique des chiffres manuscrits », 1ére édition de la conférence Maghrébine
d'Extraction et de Gestion de Connaissances, EGC-M'2010, 13-14 décembre 2010, Alger.
4. A.GATTAL, Y.CHIBANI, « Combining Multiple Segmentation Methods for Handwritten Digit
Recognition of Algerian Bank Checks ». 2nd International Conference on Information Systems
and Technologies (ICIST'2012) March 24-26 2012, Sousse, Tunisia. ISBN 978-9938-9511-2-7
5. A.GATTAL, Y.CHIBANI, « Segmentation and recognition strategy of handwritten connected digits
based on the oriented sliding window ». 13th International Conference on Frontiers in Handwriting
Recognition (ICFHR-2012) September, 2012, Bari, Italy, IEEE Computer Society. ISBN 978-0-
7695-4774-9
6. A.GATTAL, Y.CHIBANI, « Combination of features generation methods to improve recognition of
isolated handwritten digits ». 3rd International Conference on Information Systems and
Technologies - ICIST'2013, March , 2013, Tangier, Morocco, awarded the best paper.
7. Abdeljalil Gattal, " Segmentation Algerian postal check: Automatic extraction of numeral
amounts", The 2nd International Conference on Software Engineering and New Technologies
"ICSENT - 2013", Hammamet, Tunisia, December 21-23, 2013.
8. Chawki Djeddi, Labiba-Souici Meslati, Imran Siddiqi, Abdelllatif Ennaji, Haikal El Abed, Abdeljalil
Gattal," Evaluation of Texture Features for Offline Arabic Writer Identification", 11th IAPR
International Workshop on Document Analysis Systems (DAS-2014) ,2014. pp. 106-110
9. Abdeljalil Gattal, Youcef Chibani, Chawki Djeddi and Imran Siddiqi, "Improving Isolated Digit
Recognition using a Combination of Multiple Features", 14th International Conference on Frontiers
in Handwriting Recognition (ICFHR-2014) September, 2014, Crete Island, Greece, IEEE
Computer Society. ISBN 978-1-4799-4335-7, pp. 446- 451
10. Chawki Djeddi, Abdeljalil Gattal, Labiba Souici-Meslati, Imran Siddiqi, Youcef Chibani and Haikal El
Abed, "LAMIS-MSHD: A Multi-Script offline Handwriting Database", 14th International Conference
on Frontiers in Handwriting Recognition (ICFHR-2014) September, 2014, Crete Island, Greece,
IEEE Computer Society. ISBN 978-1-4799-4335-7,pp. 93-97

101
11. Chawki Djeddi, Somaya Al-Máadeed, Abdeljalil Gattal, Imran Siddiqi, Labiba Souici-Meslati, Haikal
El Abed, "ICDAR2015 competition on Multi-script Writer Identification and Gender Classification
using 'QUWI' Database", 13th International Conference on Document Analysis and Recognition,
(ICDAR 2015) August, 2015, Tunis, Tunisia, pp. 1191-1195
12. Abdeljalil Gattal, Youcef Chibani, Bilal Hadjadji, Hassiba Nemmour, Imran Siddiqi, Chawki Djeddi,
"Segmentation-verification based on fuzzy integral for connected handwritten digit recognition",
International Conference on Image Processing Theory, Tools and Applications, (IPTA 2015),
Orleans, France, November, 2015, pp. 588-591
13. Abdeljalil Gattal, Chawki Djeddi, Youcef Chibani, Imran Siddiqi, "Isolated Handwritten Digit
Recognition Using oBIFs and Background Features", 12th IAPR International Workshop on
Document Analysis Systems (DAS-2016), Santorini, Greece, 2016. pp. 305-310
55. Research projects
2012-2014: PNR Project: Traitement Automatique des Chèques Postaux Algériens.
2011-2014: CNEPRU Project: Développement d'un système pour la reconnaissance de l'écriture
manuscrite dans les documents images. Project Code: J0200220100015
2015-2017: CNEPRU Project: Techniques d'analyse biométrique des documents manuscrits.
Project Code: J0200220140009
P
ARTICIPATION IN SPECIALIZED COMPETITION
2013
Eighth position for The ICDAR 2013"Competition on Handwritten Digit Recognition". The results of
the competition was presented in the following article:
M. Diem, S. Fiel, A. Garz, M. Keglevic, F. Kleber, R. Sablatnig. "ICDAR2013 Competition on
Handwritten Digit Recognition (HDRC 2013)". In Proceedings of the 12th International Conference
on Document Analysis and Recognition, Washington DC, USA, pp. 1454-1459, 2013.
2014
Fourth position for The ICFHR 2014(Second in CVL dataset, fifth in CAR A Database, fifth in CAR B
Database)"Competition on Handwritten Digit String Recognition in Challenging Datasets ". The
results of the competition was presented in the following article:
M.Diem, S. Fiel, F. Kleber, R. Sablatnig, J.M. Saavedra, D. Contreras,J. M. Barrios, and L.S.
Oliveira. "ICFHR 2014 Competition on Handwritten Digit String Recognition in Challenging Datasets
(HDSRC 2014)". 4th International Conference on Frontiers in Handwriting Recognition (ICFHR-
2014) September, Crete Island, Greece, pp. 779-784, 2014.
Excerpt out of 114 pages

Details

Title
Segmentation-Verification for Handwritten Digit Recognition
College
National Higher School Of Computer Engineering
Author
Year
2016
Pages
114
Catalog Number
V374168
ISBN (eBook)
9783668518094
ISBN (Book)
9783668518100
File size
2336 KB
Language
English
Keywords
Handwritten Digit Recognition, Segmentation, SVM
Quote paper
Abdeljalil Gattal (Author), 2016, Segmentation-Verification for Handwritten Digit Recognition, Munich, GRIN Verlag, https://www.grin.com/document/374168

Comments

  • No comments yet.
Look inside the ebook
Title: Segmentation-Verification for Handwritten Digit Recognition



Upload papers

Your term paper / thesis:

- Publication as eBook and book
- High royalties for the sales
- Completely free - with ISBN
- It only takes five minutes
- Every paper finds readers

Publish now - it's free