A novel prototype generation technique for handwriting digit recognition

doi:10.1016/j.patcog.2013.04.016

Pattern Recognition

Volume 47, Issue 3, March 2014, Pages 1002-1010

Highlights

• A novel prototype generation technique for handwriting digit recognition.
• An evolutionary approach for improving prototype based classification.
• Prototype synthesis to build a reduced training set.

Abstract

The aim of this paper is to introduce a novel prototype generation technique for handwriting digit recognition. Prototype generation is approached as a two-stage process. The first stage uses an Adaptive Resonance Theory 1 (ART1) based algorithm to select an effective initial solution, while the second one executes a fine tuning designed to generate the best prototypes.

To this end, the second stage deals with an optimization problem, in which the objective function to be minimized is the cost function associated with the classification. A naive evolution strategy is used to generate a prototype set that reduces classification time without greatly affecting the accuracy. Moreover, as the ART1 based algorithm has incremental learning capability, the first stage is also useful for selecting the prototype set according to variations in handwriting style. The classification task is performed by the k-nearest neighbor classifier.

Experimental tests on the MNIST dataset demonstrated that our technique represents a good trade-off among accuracy, classification speed and robustness to handwriting style changes.

Introduction

Handwriting digit recognition has received remarkable attention in the field of character recognition. To meet industry demands, handwriting digit recognition systems must have good accuracy, acceptable classification times and robustness to variations in handwriting style. Currently, several approaches reach competitive accuracy, including those based on multilayer neural networks [34], support vector machines (SVMs) [31] and nearest neighbor (NN) methods [5], [23], [29]. Neural networks require huge amounts of training data and time to learn effective models, but their feed-forward nature makes them very efficient at runtime. SVMs, which exploit recent progress in convex optimization theory to train classifiers, have a simpler training phase than neural networks, and in the test phase their complexity is only a fraction of that of a brute-force k-nearest neighbor (k-NN) model, since the number of support vectors is generally a small fraction of the training data. The main issue with these approaches is their low incremental learning capacity. As a matter of fact, conventional neural networks and SVMs must be retrained in order to learn new patterns. Furthermore, when new prototypes need to be learned, these systems generally forget the previous ones. Thus, the retraining process should involve all the new samples as well as the old ones in order to guarantee a relatively high level of recognition performance. This means combining new and old samples into a single large dataset and using it to retrain the classifier, which is inefficient in terms of both time and space. Hence, in order to avoid retraining, prototype based classification using the k-NN rule may be the best option. In other words, the k-NN classifier combined with a suitable prototype generation technique may be able to provide a trade-off among recognition accuracy, classification speed and robustness to handwriting style changes.

When the k-NN classifier is adopted, classifying an unknown input vector basically consists in finding the top k similar vectors in the given training set and identifying the predominant class among these k neighbors.
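The k-NN rule just described can be illustrated with a minimal sketch. The function names, the Hamming placeholder distance and the toy data below are assumptions made for illustration only; they are not taken from the paper.

```python
from collections import Counter

def hamming(a, b):
    """Placeholder distance: number of positions where two vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def knn_classify(query, samples, labels, k=3, dist=hamming):
    """Return the predominant label among the k samples closest to query."""
    # Rank all training indices by distance to the query (brute force).
    ranked = sorted(range(len(samples)), key=lambda i: dist(query, samples[i]))
    top_labels = [labels[i] for i in ranked[:k]]
    # most_common breaks count ties in first-seen order, i.e. nearer neighbor first.
    return Counter(top_labels).most_common(1)[0][0]

# Toy data (hypothetical): two classes of 4-bit vectors.
samples = [(0, 0, 0, 0), (0, 0, 0, 1), (1, 1, 1, 1), (1, 1, 1, 0), (1, 1, 0, 0)]
labels = ["zero", "zero", "one", "one", "one"]
prediction = knn_classify((0, 0, 1, 0), samples, labels, k=3)
```

Every query is compared against every stored sample, which is why the cost of classification grows with the size of the stored set.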

Therefore, the traditional k-NN rule requires the storage of the whole training set and performs classification based on the closest training samples in the feature space. k-NN algorithms have zero training time but are usually expensive at runtime, so a small representative set of prototypes is needed. In particular, for large data sets, the k-NN rule can lead to excessive storage and long computation times in the classification stage. A way to mitigate these drawbacks is given by prototype optimization techniques [18], [21], [27]. They aim at a representative training set that is smaller than the original training set and achieves a similar or even higher classification accuracy on unknown input patterns. In the literature, two main categories of strategies can be identified: prototype generation and prototype selection. The first category strives to merge the samples from the training set into a small group of prototypes so that the performance of the k-NN rule is optimized. Examples of such techniques are the learning vector quantization algorithm [24], [39], the k-means algorithm [15] and, more recently, the works of Garain [20] and Nanni and Lumini [30].

The second category attempts to reduce the initial training set and/or increase the generalization capability of the k-NN classifier. For this purpose, many editing and condensing algorithms have been proposed. Editing algorithms [9], [19], [41] remove those representatives that lead to misclassification errors. This can be done, for example, by removing “outlier” patterns or those patterns that are surrounded mainly by others from different classes. Condensing algorithms [1], [17], [22], [42] try to build a small subset of the training set that leaves the nearest neighbor decision boundary substantially unchanged.
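As an illustration of the editing idea (removing patterns surrounded mainly by patterns of other classes), the following is a minimal sketch in the spirit of the classic edited nearest neighbor rule. It is not the specific method of any cited reference; the function name, the one-dimensional toy data and the absolute-difference distance are assumptions.

```python
from collections import Counter

def edit_training_set(samples, labels, k=3, dist=lambda a, b: abs(a - b)):
    """Return indices of samples whose k nearest neighbors agree with their label."""
    kept = []
    for i, x in enumerate(samples):
        # Rank every other sample by its distance to x.
        others = sorted((j for j in range(len(samples)) if j != i),
                        key=lambda j: dist(x, samples[j]))
        votes = Counter(labels[j] for j in others[:k])
        # Keep x only if the majority of its neighbors share its class.
        if votes.most_common(1)[0][0] == labels[i]:
            kept.append(i)
    return kept

# Toy data (hypothetical): the "b" sample at 0.25 sits inside the "a" cluster.
samples = [0.0, 0.1, 0.2, 0.25, 1.0, 1.1, 1.2]
labels = ["a", "a", "a", "b", "b", "b", "b"]
kept = edit_training_set(samples, labels, k=3)
```

In this toy run only the sample at 0.25 is discarded, because all three of its nearest neighbors belong to the other class.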

This paper presents a new prototype generation technique for improving handwriting digit recognition using the k-NN classifier. It is based on a two-stage process for finding the best prototypes to reduce the k-NN classification time, without greatly affecting the accuracy.

In summary, the representative set we are looking for should be able to:

1) drastically reduce the k-NN classification time, since it depends only on the number of prototypes for each class (and, of course, this number is much smaller than the training data size);
2) allow the k-NN classification using only the prototypes that have been previously synthesized;
3) be incrementally adapted to changes in the writing styles by adding new prototypes or modifying the previous ones.

The experimental tests, performed on the MNIST dataset using histograms of oriented gradients as image features and the Sokal and Michener dissimilarity as distance measure [36], demonstrate the effectiveness of the proposed solution compared to other strategies for building reduced prototype sets.
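For orientation, a small sketch of a Sokal and Michener style dissimilarity on binary vectors follows. It assumes the common textbook form, one minus the simple matching coefficient (the fraction of positions where the two vectors agree). The exact formulation used in the paper, and how the histogram features are turned into comparable vectors, are not given in this excerpt, so the function below is an assumption.

```python
def sokal_michener_dissimilarity(a, b):
    """One minus the simple matching coefficient of two binary vectors."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    # Count positions where both vectors are 1 or both are 0.
    matches = sum(1 for x, y in zip(a, b) if bool(x) == bool(y))
    return 1.0 - matches / len(a)

# Toy vectors (hypothetical): they disagree in 2 of 8 positions.
u = [1, 0, 1, 1, 0, 0, 1, 0]
w = [1, 1, 1, 0, 0, 0, 1, 0]
d_uw = sokal_michener_dissimilarity(u, w)   # 2 mismatches out of 8 positions
d_uu = sokal_michener_dissimilarity(u, u)   # identical vectors
```

Under this assumed definition the measure is nonnegative and gives zero for identical vectors.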

The remaining part of this paper is organized as follows. Section 2 presents the framework of the prototype based classification, focusing on our choice for feature extraction and the way we designed the distance measure. Section 3 describes the ART1 based algorithm to build an initial solution for our prototype generation approach. The naïve evolution strategy for synthesizing prototypes is illustrated in Section 4. The experimental tests and the results are discussed in Section 5. The conclusions are drawn in Section 6.

Section snippets

Prototype based classification

In the feature space representation, each sample consists of a feature vector v. Given a distance measure d (d is required to be nonnegative and to fulfill the reflexivity condition d(v, v) = 0, but it may be non-metric [43]), we call v′ ∈ {v₁, …, v_n} a nearest neighbor of v if d(v, v′) = min d(v, v_i), where the minimum is taken over i = 1, …, n. The NN rule classifies v into the class to which its nearest neighbor v′ belongs: v′ ∈ c_n → v ∈ c_n.

For the k-NN rule, the predicted class of the unknown vector v is set equal

The Adaptive Resonance Theory 1 based algorithm

The Adaptive Resonance Theory (ART) [6] was developed to avoid the stability–plasticity dilemma in competitive networks learning. The stability–plasticity dilemma addresses how to keep learning from new inputs without forgetting previously learned information. ART includes a set of different neural architectures. The first and most basic architecture is ART1 [6]. It is an unsupervised learning model especially designed for working with binary patterns. ART1 systems are robust and have an
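To make the idea of incremental learning on binary patterns concrete, here is a simplified, fast-learning ART1-style clustering sketch. It is a generic textbook-style simplification, not the ART1 based algorithm of the paper, whose details are not part of this excerpt. The function name, the `vigilance` and `beta` parameters and the toy patterns are assumptions.

```python
def art1_cluster(patterns, vigilance=0.6, beta=1.0):
    """Cluster binary tuples incrementally; returns (templates, assignments)."""
    templates = []    # one binary template per category
    assignments = []  # category index chosen for each input pattern
    for p in patterns:
        norm_p = sum(p)
        # Overlap |p AND w| between the input and each stored template.
        overlaps = [sum(a & b for a, b in zip(p, w)) for w in templates]
        # Rank categories by the choice function |p AND w| / (beta + |w|).
        order = sorted(range(len(templates)),
                       key=lambda j: overlaps[j] / (beta + sum(templates[j])),
                       reverse=True)
        chosen = None
        for j in order:
            # Vigilance test: the template must cover enough of the input.
            if norm_p > 0 and overlaps[j] / norm_p >= vigilance:
                chosen = j
                break
        if chosen is None:
            # No category resonates: create a new one without touching the others.
            templates.append(tuple(p))
            chosen = len(templates) - 1
        else:
            # Fast learning: the template shrinks to its intersection with the input.
            templates[chosen] = tuple(a & b for a, b in zip(p, templates[chosen]))
        assignments.append(chosen)
    return templates, assignments

# Toy binary patterns (hypothetical).
patterns = [(1, 1, 1, 0, 0, 0), (1, 1, 0, 0, 0, 0), (0, 0, 0, 1, 1, 1),
            (0, 0, 0, 1, 1, 0), (1, 1, 1, 0, 0, 0)]
templates, assignments = art1_cluster(patterns, vigilance=0.6)
```

A new category is created only when no existing template passes the vigilance test, so earlier categories are left untouched. This is how such a scheme keeps learning from new inputs without discarding what was already learned.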

Prototype synthesis

Prototype synthesis builds new artificial prototypes from a given collection of data. The process of finding representative samples from a dataset is classified as an NP-hard problem by several authors [13], [44], so no polynomial-time algorithm is known for solving it. Many prototype generation methods have been proposed in the literature [37] and they are useful for different purposes. Here, we are interested in a prototype set able to efficiently enhance the k-NN classification. In
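Since the snippet above is truncated, the following sketch only illustrates the general idea of refining a candidate solution with an evolution strategy that minimizes a cost function. It uses the simplest elitist variant, a (1+1) evolution strategy with Gaussian mutation, chosen purely for illustration. The paper's actual operators, encoding and cost function are not shown in this excerpt, and the toy cost below merely stands in for a classification cost.

```python
import random

def one_plus_one_es(initial, cost, sigma=0.1, iterations=200, seed=0):
    """Minimize cost by mutating one parent and keeping the child if not worse."""
    rng = random.Random(seed)  # local generator, so runs are repeatable
    best = list(initial)
    best_cost = cost(best)
    for _ in range(iterations):
        # Mutation: perturb every component with zero-mean Gaussian noise.
        child = [x + rng.gauss(0.0, sigma) for x in best]
        child_cost = cost(child)
        # Elitist selection: the parent is replaced only by an equal or better child.
        if child_cost <= best_cost:
            best, best_cost = child, child_cost
    return best, best_cost

# Toy cost (hypothetical): squared distance to a fixed target vector.
target = [0.2, 0.8, 0.5]

def toy_cost(candidate):
    return sum((c - t) ** 2 for c, t in zip(candidate, target))

start = [0.0, 0.0, 0.0]
solution, final_cost = one_plus_one_es(start, toy_cost)
```

Because a child replaces the parent only when its cost is not higher, the returned cost can never exceed the cost of the initial solution. This makes such a scheme a natural fit for fine-tuning a reasonable starting point.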

Experimental results

The experiments have been carried out using the MNIST handwritten digit database provided by LeCun et al. [26]. The MNIST training set consists of 60,000 samples, half from NIST's Special Database 3 (SD-3) and half from Special Database 1 (SD-1). The MNIST test set consists of 5000 samples from SD-3 and 5000 samples from SD-1. In the original dataset, all digit images are size normalized and centered in a fixed-size image of 28×28 pixels.

Conclusion

In this paper, we presented a novel prototype generation technique to improve handwriting digit recognition with the k-NN classifier. Our technique consists of a two-stage method, which first takes advantage of the Adaptive Resonance Theory 1 (ART1) to determine the number of prototypes and select an effective initial solution and then uses a naïve evolution strategy to generate the final solution. Based on the built representative set, the k-NN classifier reaches a recognition

Conflict of interest statement

The Authors declare that there is no conflict of interest.

References (44)

G.A. Carpenter et al., A massively parallel architecture for a self organizing neural pattern recognition machine, Computer Vision, Graphics, and Image Processing (1987)

O. Déniz et al., Face recognition using histograms of oriented gradients, Pattern Recognition Letters (2011)

H.A. Fayed et al., Self-generating prototypes for pattern classification, Pattern Recognition (2007)

M. Lozano et al., Experimental study on prototype optimisation algorithms for prototype-based classification in vector spaces, Pattern Recognition (2006)

L. Nanni et al., Particle swarm optimization for prototype reduction, Neurocomputing (2009)

X.X. Niu et al., A novel hybrid CNN–SVM classifier for recognizing handwritten digits, Pattern Recognition (2012)

F. Angiulli, Fast nearest neighbor condensation for large data sets classification, IEEE Transactions on Knowledge and Data Engineering (2007)

T. Back, Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolution Programming, Genetic Algorithms (1996)

H. Beiping et al., Fast human detection using motion detection and histogram of oriented gradients, Journal of Computers (2011)

S. Belongie et al., Shape matching and object recognition using shape contexts, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) (2002)

H.-G. Beyer et al., Evolution strategies: a comprehensive introduction, Natural Computing (2002)

G.A. Carpenter et al., Adaptive resonance theory

R. Chang et al., A modified editing k-nearest neighbor rule, Journal of Computers (2011)

D.C. Ciresan, U. Meier, L.M. Gambardella, J. Schmidhuber, Deep Big Simple Neural Nets Excel on Handwritten Digit...

P. Cunningham, A taxonomy of similarity mechanisms for case-based reasoning, IEEE Transactions on Knowledge and Data Engineering (2009)

N. Dalal, B. Triggs, Histogram of oriented gradients for human detection, in: Proceedings of the IEEE Computer Society...

M. Danesh et al., Data clustering based on an efficient hybrid of K-harmonic means, PSO and GA, Transactions on Computational Collective Intelligence (2011)

R.O. Duda et al., Pattern Classification (2001)

R.P.W. Duin, P. Juszczak, D. de Ridder, P. Paclík, E. Pękalska, D.M.J. Tax, PR-Tools, A Matlab Toolbox for Pattern...

H.A. Fayed et al., A novel template reduction approach for the K-nearest neighbor method, IEEE Transactions on Neural Networks (2009)

R. Gil-Pita et al., Evolving edited K-nearest neighbor classifiers, International Journal of Neural Systems (2008)

Sebastiano Impedovo was born in Putignano (Bari, Italy) on May 17, 1947. He obtained his degree in Physics with honours at the University of Bari in 1972. He soon became Assistant Professor in Electronics, then Associate Professor in Cybernetics in 1981, and in 1987 Full Professor in Operating Systems at the University of Bari. He is an IAPR Fellow, IEEE Senior Member, and a member of the ACM, IGS, AICA, S.I.e-L and ANIPLA societies. He has published more than three hundred papers and seven books in the field of handwriting recognition and intelligent systems for document analysis and e-learning. He is a member of the editorial board of the International Journal of Pattern Recognition and Artificial Intelligence and of the International Journal of Document Analysis and Recognition. Sebastiano Impedovo has organized many international conferences, workshops, schools, NATO ASI and international panels on Image Analysis, Document Processing, Tele-teaching and Tele-working. Last year, in Beijing, the international document analysis and recognition community proposed S. Impedovo as Honorary Chair of ICDAR 2015. Sebastiano Impedovo was the founder and the first Director of the Computer Science Department of the University of Bari; he was then President of the Computer Science Degree Course and Coordinator of the Ph.D. course in Computer Science, approved and financed by the European Union. He was also a member of the University of Bari “Senate”. Using substantial funds from the Italian Government and the European Union, he built the Rete Puglia Centre, a centre for tele-teaching and e-learning, where he serves as President. He was initially a member of the Administrative Council and then a member of the Scientific Committee of the Tecnopolis Consortium. He was President of the Directors Department Council of the University of Bari for 10 years from 1996. He is now also President of the E-learning Committee.

Francesco Maurizio Mangini was born in Bari, Italy, on February 11, 1971. He received the Electronic Engineering Degree “summa cum laude” from the Polytechnic of Bari in 1997, with a thesis on beamed microwave power transmission. From November 2000 until March 2010 he worked for IBM as an IT Specialist. He is currently a Ph.D. student in the Department of Computer Science at the University of Bari; his research interests include handwriting character and word recognition.

Donato Barbuzzi received the Computer Science degree “cum laude” in 2011 from the University of Bari “Aldo Moro”. He worked from September to December 2011 as a collaborator in the Interfaculty Center “Rete Puglia”. Since 2012 he has been a Ph.D. student in the Department of Computer Science at the University of Bari. His current research interest is in the field of multi-expert systems for pattern recognition.
