Abstract
Handwritten character recognition is one of the most widely studied branches of image pattern recognition. Tamil, an official language of Tamil Nadu in South India, Sri Lanka and Singapore, and also widely spoken in Malaysia, has a script that contains many loops and compound characters, with only small differences between character classes. Most research on offline Tamil handwritten character recognition has covered only a few character classes, because it is very difficult to distinguish the minute dissimilarities among a large number of classes. It is therefore important to design a complete recognition system that can process all character classes of Tamil and distinguish the natural variability between images of different classes. Unlike conventional machine learning approaches to pattern recognition problems, we propose a nearest interest point classifier that selects a necessary and sufficient subset of features from a variable-length, high-dimensional feature vector. Because this is a practical problem, the work also includes a study of image-to-image matching through feature analysis, without using machine learning approaches. On the standard database available for Tamil, the HP Labs offline Tamil handwritten character database, the proposed classifier achieved a recognition accuracy of 90.2% across all character classes when the whole dataset was included. The method has been compared with standard classifiers and achieves state-of-the-art recognition accuracy relative to the previous results reported in the literature.
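The abstract does not spell out the classifier, so the following is only a minimal sketch of the general idea of image-to-image matching over variable-length sets of local descriptors. It is not the authors' nearest interest point classifier. The function names (`descriptor_distance`, `image_distance`, `classify`), the Euclidean metric, the mean-of-nearest-distances score and the toy two-dimensional descriptors are all assumptions made for illustration.

```python
import math


def descriptor_distance(a, b):
    """Euclidean distance between two equal-length descriptor vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def image_distance(query, reference):
    """Mean distance from each query descriptor to its nearest reference descriptor.

    Both arguments are lists of descriptor vectors and may differ in length,
    which mirrors the variable-length feature sets mentioned in the abstract.
    """
    if not query or not reference:
        return math.inf
    total = 0.0
    for q in query:
        total += min(descriptor_distance(q, r) for r in reference)
    return total / len(query)


def classify(query, gallery):
    """Return the label of the gallery image with the smallest image_distance.

    gallery is a list of (label, descriptors) pairs (hypothetical layout).
    """
    best_label, best_dist = None, math.inf
    for label, descriptors in gallery:
        d = image_distance(query, descriptors)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


# Toy data: two labelled images with different numbers of descriptors.
gallery = [
    ("class_a", [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]),
    ("class_b", [[5.0, 5.0], [6.0, 5.0]]),
]
query = [[0.1, 0.1], [0.9, 0.0]]
predicted = classify(query, gallery)
print(predicted)
```

In this sketch no model is trained: the query is compared directly against each labelled image, which is the sense in which matching can proceed "without using machine learning approaches". A real system would extract descriptors from character images rather than use hand-written vectors.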
Notes
Lipi Toolkit from HP Labs is an open-source tool for data collection and is freely available for download at http://lipitk.sourceforge.net.
Acknowledgements
We extend our gratitude to HP Labs for providing the database for our research.
Cite this article
Ashlin Deepa, R.N., Rajeswara Rao, R. A novel nearest interest point classifier for offline Tamil handwritten character recognition. Pattern Anal Applic 23, 199–212 (2020). https://doi.org/10.1007/s10044-018-00776-x
DOI: https://doi.org/10.1007/s10044-018-00776-x