Contour-based handwritten numeral recognition using multiwavelets and neural networks
Introduction
Handwritten numeral recognition is an important problem in optical character recognition (OCR) [1]. Among existing techniques, one important approach is to extract the outer contour of the handwritten numeral. Since the contour is closed, it can be treated as a periodic function, which makes it well suited to Fourier-based methods. Zahn and Roskies [2] defined the cumulative angular function $\phi(l)$ as the net amount of angular bend between the starting point and the point with arc length $l$. They normalized $\phi(l)$ to a function $\phi^*(t)$ that is periodic on $[0, 2\pi]$. Suppose its Fourier expansion is
$$\phi^*(t) = \mu_0 + \sum_{k=1}^{\infty} \left( a_k \cos kt + b_k \sin kt \right).$$
They defined rotation- and mirror-invariant features $A_k = \sqrt{a_k^2 + b_k^2}$ and rotation-invariant features $F_{kj} = j\alpha_k - k\alpha_j$, where $\alpha_k = \tan^{-1}(b_k/a_k)$. However, they warned that $\alpha_k$ is unreliable when $A_k$ is very small, so $F_{kj}$ may also be unreliable under this condition. Granlund [3] generated a complex-valued function $u(t)$ from the contour. With the Fourier coefficients $a_n$ defined as
$$a_n = \frac{1}{T} \int_0^T u(t)\, e^{-j 2\pi n t / T}\, dt,$$
where $T$ is the period of $u(t)$, he defined the scale- and rotation-invariant features
$$b_n = \frac{a_{1+n}\, a_{1-n}}{a_1^2}$$
and
$$d_{mn} = \frac{a_{1+m}^{\,n/k}\, a_{1-n}^{\,m/k}}{a_1^{\,(m+n)/k}},$$
where $k$ is the greatest common divisor of $m$ and $n$. He also defined features that are scale-invariant but depend on rotation.
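To make the invariance of contour Fourier features concrete, the following is a minimal sketch, not the implementation used in Refs. [2] or [3]. It assumes the contour is sampled at N equally spaced points and stored as complex numbers x + jy, replaces the Fourier integral by a discrete Fourier transform, and uses features of the form b_n = a_{1+n} a_{1-n} / a_1^2 for n ⩾ 2. The function names and the test contour are hypothetical.

```python
import cmath
import math

def fourier_coefficients(contour, orders):
    """Return {n: a_n}, the DFT coefficients of a closed contour.

    `contour` is a list of complex samples u(t) = x(t) + j*y(t).
    """
    N = len(contour)
    return {
        n: sum(u * cmath.exp(-2j * math.pi * n * t / N)
               for t, u in enumerate(contour)) / N
        for n in orders
    }

def invariant_features(contour, max_order=3):
    """Features b_n = a_{1+n} * a_{1-n} / a_1**2 for n = 2..max_order."""
    a = fourier_coefficients(contour, range(1 - max_order, max_order + 2))
    return [a[1 + n] * a[1 - n] / a[1] ** 2 for n in range(2, max_order + 1)]

# Hypothetical test contour: a smooth closed curve sampled at 64 points.
N = 64
angles = [2 * math.pi * t / N for t in range(N)]
contour = [(3 + math.cos(2 * th) + 0.5 * math.sin(3 * th)) * cmath.exp(1j * th)
           for th in angles]

# Same shape, but scaled by 2.5, rotated by 0.7 rad, translated,
# and traced from a different starting point (5 samples later).
moved = [2.5 * cmath.exp(0.7j) * u + (1 + 2j)
         for u in contour[5:] + contour[:5]]

features = invariant_features(contour)
features_moved = invariant_features(moved)

# Largest change in any feature; only floating-point round-off remains.
max_diff = max(abs(p - q) for p, q in zip(features, features_moved))
print(max_diff)
```

Scaling and rotation multiply every coefficient by the same complex factor, and a change of starting point multiplies a_n by a phase proportional to n; both cancel in the ratio. Translation only changes a_0, which is not used when n ⩾ 2.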
Spline curve approximation is also frequently applied to contours. Paglieroni [4] represented contours in the spatial domain by B-spline control points, which are far fewer than the contour samples. Under proper conditions, there exists a fast transform from contours to these control points. It was shown that discriminant analysis between pairs of normalized and similarly modelled contours can be performed efficiently and directly from the control points. Taxt et al. [5] approximated the contour by a parametric spline curve. The curvature was calculated, and its values measured at regular intervals along the derived spline curve were used as descriptors in a statistical classification scheme. The features are translation-invariant by nature, but they depend on rotation. Sekita et al. [6] used splines to approximate the contour and defined the breakpoints as the maximum points of the curvature function of the contour. The contour segments between adjacent breakpoints are extracted in the form of directed curves, which compose the features of a character.
Recently, wavelet descriptors have been used for handprinted character recognition. Wunsch et al. [7] proposed using wavelet features in combination with feed-forward neural networks. Because they employed the wavelet transform with down-sampling, the wavelet coefficients can change considerably even if the contour is shifted very little. In addition, they extracted wavelet features from the x and y coordinates independently.
In this paper, we present a novel shape descriptor for the recognition of handwritten numerals. The descriptor is derived from the multiwavelet shell expansion of an object's contour. The motivation for using a multiwavelet basis is threefold. First, multiwavelets provide a localized frequency representation, which reflects local properties much better than Fourier-based methods. Second, orthonormal multiwavelets provide a natural hierarchical multiresolution representation, and there is substantial evidence that the human visual system uses similar multiscale representations. Third, and more importantly, the contour is represented by two streams of data (x,y)T, so it is natural to use a multiwavelet transform instead of a scalar one. Because we decompose the contour into an orthonormal multiwavelet shell, the poor behaviour of down-sampling in pattern recognition is avoided: a wavelet transform with down-sampling can produce quite different coefficients even when the input signal is shifted by only a few sample points, whereas the orthonormal shell decomposition does not have this problem. We have previously used the scalar orthonormal shell and the Fourier transform to extract invariant features in Ref. [8]. We trained the neural network with 4000 handwritten numerals and tested it on 2000 handwritten numerals. The handwritten numeral database is from the Centre for Pattern Recognition and Machine Intelligence at Concordia University (CENPARMI). Experimental results show that our proposed method performs better than the wavelet neural network method in Ref. [7].
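The effect of down-sampling on shifted inputs can be illustrated with a small sketch. As a simplifying assumption, it uses one level of the scalar Haar wavelet with periodic extension instead of the multiwavelets of this paper; the function names and the test signal are hypothetical.

```python
import math

S = 1 / math.sqrt(2)  # Haar filter normalization

def haar_detail_decimated(x):
    """One level of Haar detail coefficients with down-sampling by 2."""
    return [S * (x[2 * k] - x[2 * k + 1]) for k in range(len(x) // 2)]

def haar_detail_shell(x):
    """One level of Haar detail coefficients without down-sampling (periodic)."""
    n = len(x)
    return [S * (x[k] - x[(k + 1) % n]) for k in range(n)]

def shift(x, s=1):
    """Circularly shift a sequence s samples to the right."""
    s %= len(x)
    return x[-s:] + x[:-s]

def energy(c):
    """Sum of squared coefficients."""
    return sum(v * v for v in c)

signal = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
shifted = shift(signal)  # the same pulse, one sample later

# With down-sampling, the pulse edges are invisible before the shift
# and fully visible after it, so the coefficients are not comparable.
dec_a = haar_detail_decimated(signal)
dec_b = haar_detail_decimated(shifted)
print(energy(dec_a), energy(dec_b))

# Without down-sampling, the coefficients of the shifted signal are just
# a shifted copy of the original coefficients.
shell_a = haar_detail_shell(signal)
shell_b = haar_detail_shell(shifted)
print(shell_b == shift(shell_a))
```

In this example a one-sample shift moves all of the detail energy in or out of the decimated coefficients, while the undecimated coefficients are only translated, which is the property that makes a shell representation attractive for matching.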
The paper is organized as follows. Section 2 reviews multiwavelet transforms. Section 3 introduces the orthonormal multiwavelet expansion. Section 4 presents the orthonormal multiwavelet neural network descriptor. Section 5 shows some experimental results. Finally, Section 6 concludes the paper.
Discrete multiwavelet transform
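Multiwavelets are a generalization of scalar wavelets [9], [10], [11], [12]. A multiwavelet basis uses translations and dilations of M⩾2 scaling functions {ϕm(t)}1⩽m⩽M and M mother wavelet functions {ψm(t)}1⩽m⩽M. If we write Φ(t)=(ϕ1(t),ϕ2(t),…,ϕM(t))T and Ψ(t)=(ψ1(t),ψ2(t),…,ψM(t))T, then Φ(t) and Ψ(t) each satisfy a matrix two-scale equation: Φ(t) is a linear combination of the translates Φ(2t−l) weighted by {Hl}0⩽l⩽L−1, and Ψ(t) is a linear combination of the same translates weighted by {Gl}0⩽l⩽L−1, where Hl and Gl are M×M filter matrices.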
As an example, we give the most commonly used multiwavelets developed by Geronimo et al. [9]. Let
The orthonormal shell expansion
In this section we generalize the orthonormal shell expansion, developed in Ref. [14] for scalar wavelets, to the multiwavelet case. A shell is a multiresolution wavelet decomposition of the original signal in which no down-sampling is performed, so we get the same number of wavelet coefficients at every decomposition scale. It should be mentioned that the coefficients of orthogonal multiwavelet expansions are not shift-invariant. However, if all the multiwavelet
Orthonormal multiwavelet neural network descriptor
Feature selection is the critical step in the recognition process, and what distinguishes OCR methodologies from each other is the type of features selected for representation. In general, good features must satisfy the following requirements. First, the intraclass variance must be small, which means that features derived from different samples of the same class should be close. Second, the interclass separation should be large, i.e. features derived from samples of different classes should
Experimental results
To evaluate the performance of our proposed recognition system, we use a 3-layer feed-forward neural network in our experiments. The number of nodes in each layer is given by 40×20×10. The input layer is not given here because it depends on the input feature size. Our experiments are performed on the CENPARMI handwritten numeral database. This database contains 6000 unconstrained handwritten numerals originally collected from dead letter envelopes by the US Postal Service at different
Conclusion
In this paper, we introduced a novel set of features that is well suited for representing digitized handwritten numerals. The features are derived from the multiwavelet shell expansion of the numeral contour. The numeral contours are represented by fine-to-coarse approximations at different resolution levels by means of the orthonormal multiwavelet expansion. We suggest using the low to intermediate levels of the shell coefficients since these features are relatively insensitive to the shape
Acknowledgements
This work was supported by research grants from the Natural Sciences and Engineering Research Council of Canada and by the Fonds pour la Formation de Chercheurs et l'Aide à la Recherche of Quebec. We would like to thank the referees for their careful reading of the manuscript, which led to further improvement of the paper.
References (18)
- Paglieroni et al., Fast classification of discrete shape contours, Pattern Recognition (1987)
- Taxt et al., Recognition of handwritten symbols, Pattern Recognition (1990)
- Wunsch et al., Wavelet descriptors for multiresolution recognition of handprinted characters, Pattern Recognition (1995)
- Geronimo et al., Fractal functions and wavelet expansions based on several scaling functions, J. Approximation Theory (1994)
- Trier et al., Feature extraction methods for character recognition: a survey, Pattern Recognition (1996)
- Zahn et al., Fourier descriptors for plane closed curves, IEEE Trans. Comput. (1972)
- Granlund, Fourier preprocessing for hand print character recognition, IEEE Trans. Comput. (1972)
- Sekita et al., Feature extraction of handwritten Japanese characters by spline functions for relaxation matching, Pattern Recognition (1990)
- Bui et al., An orthonormal-shell-Fourier descriptor for rapid matching of patterns in image database, Int. J. Pattern Recognition Artif. Intell. (2001)
About the Author—G.Y. CHEN received the B.Sc. in Applied Mathematics, the M.Sc. in Computing Mathematics, and the M.Sc. in Computer Science. He was a research associate at the Shenyang Institute of Computing Technology of the Chinese Academy of Sciences during 1989–1994. He worked at Matrox Graphics Inc. during 1999–2001. Currently he is a Ph.D. student at Concordia University, Montreal, Canada, where he received the FCAR and NSERC fellowships. His research interests include pattern recognition, image processing, and neural networks.
About the Author—T.D. BUI is a full professor in the Department of Computer Science, Concordia University, Montreal, Canada. He was the Chair of the Department from 1985 to 1990, and an Associate Vice-Rector Research at the same University from 1992 to 1996. Dr. Bui has published more than 120 papers in many different areas in scientific journals and conference proceedings. He was an invited professor at the Istituto per le Applicazioni del Calcolo in Rome under the auspices of the National Research Council of Italy (1978–1979), and a visiting professor at the Department of Mechanical Engineering, the University of California at Berkeley (1983–1984). He has received many research grants and contracts from government and industry. His research interests include scientific computing, wavelet transforms and applications, and image processing. Dr. Bui is a Fellow of the British Physical Society, a Senior Member of the Society for Computer Simulation, and a Member of the IEEE. He is an associate editor of the journal Simulation and currently an associate editor of the Transactions on Computer Simulation and Modelling. He has served as a program committee member of the International Conference on Wavelet Analysis and its Applications, held Dec. 15–20, 2001 in Hong Kong.
About the Author—A. KRZYZAK received the M.Sc. and Ph.D. degrees in Computer Engineering from the Wroclaw University of Technology, Poland, in 1977 and 1980, respectively, and the D.Sc. degree (habilitation) in Computer Engineering from the Warsaw University of Technology, Poland, in 1998. In 1980, he became an assistant professor at the Institute of Engineering Cybernetics, Wroclaw University of Technology, Poland. From November 1982 until July 1983, he was a postdoctoral fellow receiving the International Scientific Exchange Award in the School of Computer Science, McGill University, Montreal, PQ, Canada. Since August 1983, he has been with the Department of Computer Science, Concordia University, Montreal, where he is currently a professor. In 1991, he held the Vineberg Memorial Fellowship at Technion-Israel Institute of Technology and, in 1992, the Humboldt Research Fellowship at the University of Erlangen-Nurnberg, Germany. He visited the University of California at Irvine, the Information Systems Laboratory at Stanford University, the Riken Frontiers Research Laboratory, Japan, Stuttgart University, and the Technical University of Berlin, Germany. His research interests include pattern recognition, image processing, computer vision, neural networks, and nonparametric estimation. He has been an associate editor of IEEE Transactions on Neural Networks and is presently on the editorial boards of the Pattern Recognition Journal and the Journal of Neural, Parallel and Scientific Computations. He was coeditor of the book Computer Vision and Pattern Recognition (Singapore: World Scientific, 1989) and is coauthor of the book A Distribution-Free Theory of Nonparametric Regression, published by Springer-Verlag. He has served on the program committees of Vision Interface’88, Vision Interface’94, Vision Interface’95, Vision Interface’99, the 1995 International Conference on Document Processing and Applications, and the First International Workshop on Computer Vision, Pattern Recognition, and Image Processing, 1998. He co-organized a workshop at the NIPS’94 Conference and was a session organizer at the Third World Congress of Nonlinear Analysis, Catania, Italy, 2000.