Abstract
A novel method based on multi-modal discriminant analysis is proposed to reduce feature dimensionality. First, each class is divided into several clusters by the k-means algorithm. The optimal discriminant analysis is then implemented by multi-modal mapping. Our method uses only those training samples on and near the effective decision boundary to generate a between-class scatter matrix, so it requires less CPU time than other nonparametric discriminant analysis (NDA) approaches [Fukunaga and Mantock in IEEE Trans PAMI 5(6):671–677, 1983; Bressan and Vitria in Pattern Recognit Lett 24(5):2743–2749, 2003]. In addition, no prior assumptions about class and cluster densities are needed. To achieve high verification performance on easily confused handwritten numeral pairs, a hybrid feature extraction scheme is developed, consisting of a set of gradient-based wavelet features and a set of geometric features. The proposed dimensionality reduction algorithm is then applied to these features, and it outperforms principal component analysis (PCA) and other NDA approaches. Experiments show that the proposed method achieves high feature compression without sacrificing discriminant ability for classification. As a result, the new method can reduce artificial neural network (ANN) training complexity and make the ANN classifier more reliable.
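The first two steps named in the abstract (splitting each class into clusters with k-means, then building a between-class scatter matrix from boundary samples only) can be illustrated with a minimal sketch. This is not the authors' formulation: the function names, the `keep` fraction, and the rule "a sample is near the boundary if it lies close to a cluster centroid of another class" are all assumptions made for illustration, and the abstract does not specify how the effective decision boundary is found.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a list of k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            groups[j].append(p)
        # An emptied cluster keeps its previous centroid.
        centroids = [mean(g) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return centroids


def boundary_scatter(data, k=2, keep=0.5):
    """Between-class scatter from boundary samples only (illustrative).

    data: dict mapping class label -> list of feature vectors.
    keep: hypothetical fraction of samples, ranked by distance to the
          nearest other-class cluster centroid, treated as "near the
          boundary".
    """
    centroids = {c: kmeans(pts, k) for c, pts in data.items()}
    ranked = []
    for c, pts in data.items():
        others = [m for o, ms in centroids.items() if o != c for m in ms]
        for p in pts:
            nearest = min(others, key=lambda m: sq_dist(p, m))
            diff = [a - b for a, b in zip(p, nearest)]
            ranked.append((sq_dist(p, nearest), diff))
    ranked.sort(key=lambda t: t[0])
    kept = ranked[:max(1, int(keep * len(ranked)))]
    d = len(kept[0][1])
    scatter = [[0.0] * d for _ in range(d)]
    for _, v in kept:
        for i in range(d):
            for j in range(d):
                scatter[i][j] += v[i] * v[j] / len(kept)
    return scatter
```

In a full pipeline the leading eigenvectors of a matrix like this (usually after normalizing by a within-class scatter) would give the projection directions; that step is omitted here because the abstract does not describe it.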
References
Suen CY, Nadal C, Legault R, Mai TA, Lam L (1992) Computer recognition of unconstrained handwritten numerals. Proc IEEE 80(7):1162–1180
Liu C-L, Nakashima K, Sako H, Fujisawa H (2004) Handwritten digit recognition: investigation of normalization and feature extraction techniques. Pattern Recognit 37(2):265–279
Chen G, Bui TD (1999) Invariant Fourier-wavelet descriptor for pattern recognition. Pattern Recognit 32(7):1083–1088
Pudil P, Novovicova J, Kittler J (1994) Floating search methods in feature selection. Pattern Recognit Lett 15(11):1119–1125
Narendra P, Fukunaga K (1977) A branch and bound algorithm for feature subset selection. IEEE Trans Comput C-26(9):917–922
Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Machine Learn Res 3(3):1157–1182
Oh I-S, Lee J-S, Suen CY (1999) Analysis of class separation and combination of class-dependent features for handwriting recognition. IEEE Trans PAMI 21(10):1089–1094
Kudo M, Sklansky J (2000) Comparison of algorithms that select features for pattern classifiers. Pattern Recognit 33(1):25–44
Raymer ML, Punch WF, Goodman ED, Kuhn LA, Jain AK (2000) Dimensionality reduction using genetic algorithms. IEEE Trans Evol Comput 4(2):164–171
Kim G, Kim S (2000) Feature selection using genetic algorithms for handwritten character recognition. In: Proceedings of the 7th international workshop on frontiers in handwriting recognition (IWFHR 2000), Amsterdam, The Netherlands, September 2000, pp 103–110
Oliveira LS, Sabourin R, Bortolozzi F, Suen CY (2003) A methodology for feature selection using multi-objective genetic algorithms for handwritten digit string recognition. Int J Pattern Recognit Artif Intell (IJPRAI) 16(6):903–929
Bahlmann C, Haasdonk B, Burkhardt H (2002) On-line handwriting recognition with support vector machines: a kernel approach. In: Proceedings of the 8th international workshop on frontiers in handwriting recognition (IWFHR 2002), Ontario, Canada, August 2002
Bi J, Bennett K, Embrechts M, Breneman C, Song M (2003) Dimensionality reduction via sparse support vector machines. J Machine Learn Res 3:1229–1243
Fukunaga K (1990) Introduction to statistical pattern recognition, 2nd edn. Academic Press, New York
Krishnaiah PR (ed) (1980) Analysis of variance. North-Holland, New York
Fisher RA (1936) The use of multiple measurements in taxonomic problems. Ann Eugen 7(2):179–188
Fukunaga K, Mantock JM (1983) Nonparametric discriminant analysis. IEEE Trans PAMI 5(6):671–677
Lee C, Landgrebe DA (1993) Feature extraction based on decision boundaries. IEEE Trans PAMI 15(4):388–400
Hastie T, Tibshirani R, Buja A (1995) Flexible discriminant and mixture models. In: Kay J, Titterington D (eds) Neural networks and statistics. Oxford University Press, Oxford
Hastie T, Tibshirani R (1996) Discriminant analysis by Gaussian mixtures. J Roy Stat Soc B 58(1):155–176
Hastie T, Tibshirani R (1996) Discriminant adaptive nearest neighbor classification. IEEE Trans PAMI 18(6):607–616
Torkkola K (2003) Feature extraction by non-parametric mutual information maximization. J Machine Learn Res 3:1415–1438
Fukumizu K, Bach FR, Jordan MI (2004) Dimensionality reduction for supervised learning with reproducing kernel Hilbert spaces. J Machine Learn Res 5:73–99
Bressan M, Vitria J (2003) Nonparametric discriminant analysis and nearest neighbor classification. Pattern Recognit Lett 24(5):2743–2749
Daubechies I (1992) Ten lectures on wavelets. Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania
Oliveira LS, Sabourin R, Bortolozzi F, Suen CY (2002) Impacts of verification on a numeral string recognition system. Pattern Recognit Lett 24(7):1023–1031
LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324
Acknowledgements
The authors thank the referees of this paper for their constructive comments and the editor handling this paper. Financial support from NSERC and FCAR of Canada is very much appreciated.
Cite this article
Zhang, P., Bui, T.D. & Suen, C.Y. Feature dimensionality reduction for the verification of handwritten numerals. Pattern Anal Applic 7, 296–307 (2004). https://doi.org/10.1007/s10044-004-0226-1