Elsevier

Pattern Recognition

Volume 36, Issue 7, July 2003, Pages 1597-1604
Contour-based handwritten numeral recognition using multiwavelets and neural networks

https://doi.org/10.1016/S0031-3203(02)00252-2

Abstract

In this paper, we develop a handwritten numeral recognition descriptor using multiwavelets and neural networks. We first trace the contour of the numeral, then normalize and resample the contour so that it is translation- and scale-invariant. We then perform multiwavelet orthonormal shell expansion on the contour to get several resolution levels and the average. Finally, we use the shell coefficients as features to input into a feed-forward neural network to recognize the handwritten numerals. The main advantage of the orthonormal shell decomposition is that it decomposes a signal into multiresolution levels, but without down-sampling. Wavelet transforms with down-sampling can give very different coefficients when the input signal is shifted. This is the main limitation of wavelet transforms in pattern recognition. For the shell expansion, we prefer multiwavelets to scalar wavelets because we have two coordinates x and y for each point on the contour. If we extract features from x and y separately, just as Wunsch et al. did (Pattern Recognition 28 (1995) 1237), then we may not get the best features. In addition, we know that multiwavelets have advantages over scalar wavelets, such as short support, orthogonality, symmetry and higher order of vanishing moments. These properties allow multiwavelets to outperform scalar wavelets in some applications, e.g. signal denoising (IEEE Trans. Signal Process. 46 (12) (1998) 3414). We conducted experiments and found that it is feasible to use multiwavelet features in handwritten numeral recognition.
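The contour normalization step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the normalization is carried out, so everything here is an assumption made for illustration: the function name `normalize_contour`, resampling at equal arc-length spacing, centring on the centroid, and dividing by the RMS distance to the centroid.

```python
import numpy as np

def normalize_contour(points, n_samples=64):
    """Resample a closed contour and remove translation and scale.

    Hypothetical helper: arc-length resampling, centroid centring and
    RMS-radius scaling are illustrative choices, not taken from the paper.
    """
    pts = np.asarray(points, dtype=float)
    closed = np.vstack([pts, pts[:1]])               # repeat first point to close the loop
    seg = np.linalg.norm(np.diff(closed, axis=0), axis=1)
    arc = np.concatenate([[0.0], np.cumsum(seg)])    # cumulative arc length
    targets = np.linspace(0.0, arc[-1], n_samples, endpoint=False)
    x = np.interp(targets, arc, closed[:, 0])        # linear interpolation along the contour
    y = np.interp(targets, arc, closed[:, 1])
    resampled = np.column_stack([x, y])
    resampled -= resampled.mean(axis=0)              # remove translation
    rms_radius = np.sqrt((resampled ** 2).sum(axis=1).mean())
    return resampled / rms_radius                    # remove scale

square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
moved = 3.0 * square + np.array([5.0, -2.0])         # scaled and translated copy
same = np.allclose(normalize_contour(square), normalize_contour(moved))
```

Under these assumptions a contour and its scaled, translated copy map to the same normalized point sequence, which is the property the feature extraction relies on.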

Introduction

Handwritten numeral recognition is an important problem in optical character recognition (OCR) [1]. Among existing techniques, one important approach is to extract the outer contour of the handwritten numeral. Since the contour is periodic, it is well suited to Fourier-based methods. Zahn and Roskies [2] defined the cumulative angular function $\phi(l)$ as the net amount of angular bend between the starting point and the point with arc length $l$. They normalized $\phi(l)$ so that it is periodic. Suppose the Fourier expansion is

$$\phi(t) = \mu_0 + \sum_{k=1}^{\infty} \left( a_k \cos kt + b_k \sin kt \right).$$

They defined rotation- and mirror-invariant features $A_k = \sqrt{a_k^2 + b_k^2}$ and rotation-invariant features $F_{kj} = j\alpha_k - k\alpha_j$, where $\alpha_k = \tan^{-1}(b_k/a_k)$. However, they warned that $\alpha_k$ is unreliable when $A_k$ is very small, so $F_{kj}$ may be unreliable under this condition. Granlund [3] generated a complex-valued function $u(t)$ for the contour. Suppose the Fourier coefficients $a_n$ are defined as

$$a_n = \frac{1}{T} \int_0^T u(t)\, e^{jn2\pi t/T}\, dt.$$

He then defined the scale- and rotation-invariant features

$$b_n = \frac{a_{1+n}\, a_{1-n}}{a_1^2} \quad \text{and} \quad D_{mn} = a_{1+m}^{\,n/k}\, a_{1-n}^{\,m/k},$$

where $k$ is the greatest common divisor of $m$ and $n$. He also defined features that are scale-invariant, but depend on rotation.
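As a rough illustration of Granlund's features, the sketch below computes b_n for a sampled contour. It is a discrete approximation under stated assumptions: the contour is given as equally spaced samples, the integral is replaced by a discrete sum, and the positive exponent sign follows the formula as written above (which is why `np.fft.ifft` is used; with the opposite sign convention one would use `np.fft.fft(u) / len(u)` instead). The function name `granlund_features` is hypothetical, and n starts at 2 so that a_0, which carries the contour's position, is never used.

```python
import numpy as np

def granlund_features(contour_xy, n_max=4):
    """Return b_n = a_{1+n} * a_{1-n} / a_1**2 for n = 2..n_max.

    Hypothetical helper. Assumes equally spaced contour samples and
    a_1 != 0; the contour needs more than 2 * n_max samples.
    """
    pts = np.asarray(contour_xy, dtype=float)
    u = pts[:, 0] + 1j * pts[:, 1]       # complex contour u = x + j*y
    # ifft computes a[n] = (1/N) * sum_k u[k] * exp(+j*2*pi*n*k/N),
    # matching the sign of the exponent in the formula as written.
    a = np.fft.ifft(u)
    # Negative indices wrap around, so a[1 - n] is the coefficient of order 1 - n.
    return np.array([a[1 + n] * a[1 - n] / a[1] ** 2
                     for n in range(2, n_max + 1)])

# Synthetic closed curve built from a few harmonics (illustrative data only).
k = np.arange(64)
u = (np.exp(-2j * np.pi * k / 64)
     + 0.3 * np.exp(2j * np.pi * k / 64)
     + (0.2 + 0.1j) * np.exp(-2j * np.pi * 3 * k / 64))
contour = np.column_stack([u.real, u.imag])
features = granlund_features(contour, n_max=3)
```

Rotating the contour multiplies every a_n by the same unit complex number and scaling multiplies every a_n by the same real factor; both cancel in the ratio, which is why the b_n values are unchanged.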

Spline curve approximation is also frequently applied to contours. Paglieroni [4] represented contours in the spatial domain by far fewer B-spline control points than contour samples. Under proper conditions, there exists a fast transform from contours to these control points. It was shown that discriminant analysis between pairs of normalized and similarly modelled contours can be performed efficiently and directly from the control points. Taxt et al. [5] approximated the contour by a parametric spline curve. The curvatures are calculated, and the values measured at regular intervals along these derived spline curves are used as descriptors in a statistical classification scheme. The features are translation-invariant by nature, but depend on rotation. Sekita et al. [6] used splines to approximate the contour and defined the breakpoints as the maximum points of the curvature functions of the contours. The contour segments between adjacent breakpoints are extracted in the form of directed curves, which compose the features of a character.

Recently, wavelet descriptors have been used for handprinted character recognition. Wunsch et al. [7] proposed using wavelet features in combination with feed-forward neural networks. Because they employed the wavelet transform with down-sampling, the wavelet coefficients can change considerably even if the contour is shifted only slightly. In addition, they extracted wavelet features from the x and y coordinates independently.

In this paper, we present a novel shape descriptor for the recognition of handwritten numerals. The descriptor is derived from the multiwavelet shell expansion of an object's contour. The motivation for using a multiwavelet basis is threefold. First, multiwavelets provide a localized frequency representation, which can reflect local properties much better than Fourier-based methods. Second, orthonormal multiwavelets provide a natural hierarchical multiresolution representation, and there is substantial evidence that the human visual system uses similar multiscale representations. Third, and more importantly, the contour is represented by two streams of data (x,y)T, so it is natural to use a multiwavelet transform instead of a scalar one. Since we decompose the contour into an orthonormal multiwavelet shell, the poor behaviour of down-sampling in pattern recognition can be avoided. More precisely, a wavelet transform with down-sampling can produce quite different wavelet coefficients even if the input signal is shifted by only a few sample points. In contrast, the orthonormal shell decomposition does not have this problem. In fact, we have successfully used the scalar orthonormal shell and the Fourier transform to extract invariant features in Ref. [8]. We trained the neural network with 4000 handwritten numerals. The test dataset consists of 2000 handwritten numerals. The handwritten numeral databases are from the Centre for Pattern Recognition and Machine Intelligence at Concordia University (CENPARMI). Experimental results show that our proposed method is better than the wavelet neural network method in Ref. [7].
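The effect of down-sampling on shifted input can be seen in a small numerical sketch. This is not the multiwavelet shell used in the paper: it assumes a one-level scalar Haar filter on a periodic signal, chosen only because it is the simplest case, and the function names are hypothetical.

```python
import numpy as np

def haar_detail_decimated(x):
    """One-level Haar detail coefficients with down-sampling (N/2 values)."""
    x = np.asarray(x, dtype=float)
    return (x[0::2] - x[1::2]) / np.sqrt(2.0)

def haar_detail_undecimated(x):
    """One-level Haar detail coefficients without down-sampling (N values).

    The signal is treated as periodic, as a closed contour would be.
    """
    x = np.asarray(x, dtype=float)
    return (x - np.roll(x, -1)) / np.sqrt(2.0)

signal = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0])
shifted = np.roll(signal, 1)                  # same signal, shifted by one sample

dec_original = haar_detail_decimated(signal)  # all zeros for this signal
dec_shifted = haar_detail_decimated(shifted)  # two non-zero coefficients

und_original = haar_detail_undecimated(signal)
und_shifted = haar_detail_undecimated(shifted)
# Without down-sampling the coefficients are just a shifted copy.
equivariant = np.allclose(und_shifted, np.roll(und_original, 1))
```

In this toy case a one-sample shift turns an all-zero set of decimated detail coefficients into one with two non-zero entries, whereas the undecimated coefficients keep the same values and only move by one position.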

The paper is organized as follows. Section 2 reviews multiwavelet transforms. Section 3 introduces the orthonormal multiwavelet expansion. Section 4 presents the orthonormal multiwavelet neural network descriptor. Section 5 shows some experimental results. Finally, Section 6 concludes the paper.

Discrete multiwavelet transform

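Multiwavelets are a generalization of scalar wavelets [9], [10], [11], [12]. A multiwavelet basis uses translations and dilations of $M \geq 2$ scaling functions $\{\phi_m(t)\}_{1 \leq m \leq M}$ and $M$ mother wavelet functions $\{\psi_m(t)\}_{1 \leq m \leq M}$. If we write $\Phi(t) = (\phi_1(t), \phi_2(t), \ldots, \phi_M(t))^T$ and $\Psi(t) = (\psi_1(t), \psi_2(t), \ldots, \psi_M(t))^T$, then we have

$$\Phi(t) = 2 \sum_{l=0}^{L-1} H_l \Phi(2t - l) \quad \text{and} \quad \Psi(t) = 2 \sum_{l=0}^{L-1} G_l \Phi(2t - l),$$

where $\{H_l\}_{0 \leq l \leq L-1}$ and $\{G_l\}_{0 \leq l \leq L-1}$ are $M \times M$ filter matrices.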

As an example, we give the most commonly used multiwavelets developed by Geronimo et al. [9]. Let H0 =

The orthonormal shell expansion

In this section we generalize the orthonormal shell expansion, developed in Ref. [14] for scalar wavelets, to the multiwavelet case. In essence, a shell is a multiresolution wavelet decomposition of the original signal in which no down-sampling is performed. This means that we get the same number of wavelet coefficients at every decomposition scale. It should be mentioned that the coefficients of orthogonal multiwavelet expansions are not shift-invariant. However, if all the multiwavelet

Orthonormal multiwavelet neural network descriptor

Feature selection is the critical step in the recognition process, and what distinguishes OCR methodologies from each other are the types of features selected for representation. In general, good features must satisfy the following requirements: First, intraclass variance must be small, which means that features derived from different samples of the same class should be close. Secondly, the interclass separation should be large, i.e. features derived from samples of different classes should

Experimental results

In order to evaluate the performance of our proposed recognition system, we use a 3-layer feed-forward neural network in our experiments. The number of nodes in each layer is given by 40×20×10. The input layer is not given here because it depends on the input feature size. Our experiments are performed on the CENPARMI handwritten numeral database. This database contains 6000 unconstrained handwritten numerals originally collected from dead letter envelopes by the US postal service at different

Conclusion

In this paper, we introduced a novel set of features that is well-suited for representing digitized handwritten numerals. The features are derived from the multiwavelet shell expansion of the numeral contour. The numeral contours are represented by fine-to-coarse approximations at different resolution levels by means of orthonormal multiwavelet expansion. It is suggested to use the low to intermediate levels of the shell coefficients since these features are relatively insensitive to the shape

Acknowledgements

This work was supported by research grants from the Natural Sciences and Engineering Research Council of Canada and by the Fonds pour la Formation de Chercheurs et l'Aide à la Recherche of Quebec. We would like to thank the referees for careful reading of the manuscript which leads to further improvement of the paper.

About the Author—G.Y. CHEN received the B.Sc. in Applied Mathematics, the M.Sc. in Computing Mathematics, and the M.Sc. in Computer Science as well. He was a research associate at Shenyang Institute of Computing Technology of the Chinese Academy of Sciences during 1989–1994. He worked in Matrox Graphics Inc. during 1999–2001. Currently he is a Ph.D. student at Concordia University, Montreal, Canada, where he received the FCAR and NSERC fellowships. His research interests include pattern recognition, image processing, and neural networks.

About the Author—T.D. BUI is a full professor in the Department of Computer Science, Concordia University, Montreal, Canada. He was the Chair of the Department from 1985 to 1990, and an Associate Vice-Rector Research at the same University from 1992 to 1996. Dr. Bui has published more than 120 papers in many different areas in scientific journals and conference proceedings. He was an invited professor at the Istituto per le Applicazioni del Calcolo in Rome under the auspices of the National Research Council of Italy (1978–1979), and a visiting professor at the Department of Mechanical Engineering, the University of California at Berkeley (1983–1984). He has received many research grants and contracts from government and industry. His research interests include scientific computing, wavelet transforms and applications, and image processing. Dr. Bui is a Fellow of the British Physical Society, a Senior Member of the Society for Computer Simulation, and a Member of the IEEE. He is an associate editor of the journal Simulation and currently an associate editor of the Transactions on Computer Simulation and Modelling. He served as a program committee member of the International Conference on Wavelet Analysis and its Applications, Dec. 15–20, 2001, in Hong Kong.

About the Author—A. KRZYZAK received the M.Sc. and Ph.D. degrees in Computer Engineering from the Wroclaw University of Technology, Poland, in 1977 and 1980, respectively, and the D.Sc. degree (habilitation) in Computer Engineering from the Warsaw University of Technology, Poland, in 1998. In 1980, he became an assistant professor at the Institute of Engineering Cybernetics, Wroclaw University of Technology, Poland. From November 1982 until July 1983, he was a postdoctoral fellow receiving the International Scientific Exchange Award in the School of Computer Science, McGill University, Montreal, PQ, Canada. Since August 1983, he has been with the Department of Computer Science, Concordia University, Montreal, where he is currently a professor. In 1991, he held the Vineberg Memorial Fellowship at Technion-Israel Institute of Technology and, in 1992, the Humboldt Research Fellowship at the University of Erlangen-Nurnberg, Germany. He visited the University of California at Irvine, the Information Systems Laboratory at Stanford University, the Riken Frontiers Research Laboratory, Japan, Stuttgart University, and the Technical University of Berlin, Germany. His research interests include pattern recognition, image processing, computer vision, neural networks, and nonparametric estimation. He has been an associate editor of IEEE Transactions on Neural Networks and is presently on the editorial boards of Pattern Recognition Journal and Journal of Neural, Parallel and Scientific Computations. He was coeditor of the book Computer Vision and Pattern Recognition (Singapore: World Scientific, 1989) and is coauthor of the book A Distribution-Free Theory of Nonparametric Regression, published by Springer-Verlag. He has served on the program committees of Vision Interface’88, Vision Interface’94, Vision Interface’95, Vision Interface’99, the 1995 International Conference on Document Processing and Applications, and the First International Workshop on Computer Vision, Pattern Recognition, and Image Processing, 1998. He co-organized a workshop at the NIPS’94 Conference and was a session organizer at the Third World Congress of Nonlinear Analysis, Catania, Italy, 2000.
