Abstract
This paper investigates the applicability of features inspired by the visual ventral stream to handwritten character recognition. A set of scale- and translation-invariant C2 features is first extracted from all images in the dataset. Three standard classifiers (kNN, ANN and SVM) are then trained on a training set and compared on a separate test set. To achieve a higher recognition rate, a two-stage classifier was designed, with different preprocessing in the second stage. Experiments on the well-known MNIST database and on standard Farsi digit and character datasets show high recognition rates that are competitive with some of the best existing approaches. In addition, an analysis is conducted to evaluate the robustness of this approach to orientation, scale and translation distortions.
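To make the invariance claim concrete, the following is a minimal sketch of why a C2-style feature does not depend on where a pattern appears. It is not the authors' implementation. It assumes a one-dimensional signal in place of an image, a single hand-picked prototype, and a Gaussian similarity with a hypothetical width `sigma`. The function names `s2_responses` and `c2_feature` are illustrative only. In the ventral stream models this work builds on, the template-matching stage operates on pooled Gabor filter outputs and the final maximum is also taken over scales, which is omitted here.

```python
import math


def s2_responses(signal, prototype, sigma=1.0):
    """Gaussian similarity between the prototype and every window of the signal.

    Simplifying assumption: the signal is 1-D and the windows are raw values,
    not outputs of an earlier filtering and pooling stage.
    """
    n = len(prototype)
    responses = []
    for start in range(len(signal) - n + 1):
        window = signal[start:start + n]
        dist2 = sum((w - p) ** 2 for w, p in zip(window, prototype))
        responses.append(math.exp(-dist2 / (2 * sigma ** 2)))
    return responses


def c2_feature(signal, prototype, sigma=1.0):
    """Global max over all positions: the location of the best match is discarded."""
    return max(s2_responses(signal, prototype, sigma))


# Hypothetical prototype and two signals containing the same pattern at
# different positions.
prototype = [0.0, 1.0, 0.0]
original = [0, 0, 1, 0, 0, 0, 0, 0]
shifted = [0, 0, 0, 0, 0, 1, 0, 0]
blank = [0, 0, 0, 0, 0, 0, 0, 0]

c2_original = c2_feature(original, prototype)
c2_shifted = c2_feature(shifted, prototype)
c2_blank = c2_feature(blank, prototype)
```

Because the maximum is taken over every position, `c2_original` and `c2_shifted` are equal, while `c2_blank` is lower because no window matches the prototype. A feature vector for a classifier would contain one such value per prototype.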
Cite this article
Borji, A., Hamidi, M. & Mahmoudi, F. Robust Handwritten Character Recognition with Features Inspired by Visual Ventral Stream. Neural Process Lett 28, 97–111 (2008). https://doi.org/10.1007/s11063-008-9084-y
DOI: https://doi.org/10.1007/s11063-008-9084-y