Abstract
In many documents, digits may touch each other, and segmenting individual numerals from a touching string is difficult, so the digit string has to be recognized as a whole. In this paper, we propose a digit string recognition system for four popular Indian scripts: Kannada, Oriya, Tamil and Telugu. The paper makes two contributions. (i) We have developed four digit string datasets, one for each of these scripts. Each dataset has 20000 numeral string samples for training and 30000 samples for testing. As no such dataset is currently available, they will be helpful to the community. (ii) We apply an RNN-free architecture based on a CNN (Convolutional Neural Network) and CTC (Connectionist Temporal Classification) for numeral string recognition. Unlike a normal text string, a string of digits carries no contextual information among its digits, so any digit may be followed by an arbitrary digit. For this reason, we use a CNN and CTC based architecture without an RNN. We tested our scheme on the test set of each dataset, and results are provided.
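To make the role of CTC concrete, the following is a minimal sketch of standard best-path (greedy) CTC decoding, which turns per-frame class scores into a digit string without segmenting individual numerals. The abstract does not state which decoding rule the authors use, so this is an illustration of the general technique, not the paper's implementation. The blank index (0), the class-to-digit mapping (class k stands for digit k - 1), and the function names are assumptions made for this example.

```python
from typing import List, Sequence

BLANK = 0  # assumption: class 0 is the CTC blank, classes 1..10 are digits 0..9


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      blank: int = BLANK) -> List[int]:
    """Best-path decoding: argmax per frame, merge repeats, drop blanks."""
    # Most likely class for every frame (column) produced by the network.
    best_path = [max(range(len(scores)), key=scores.__getitem__)
                 for scores in frame_scores]

    decoded: List[int] = []
    previous = blank
    for label in best_path:
        # Keep a label only when it is not blank and not a repeat of the
        # previous frame; a blank between two equal labels separates them.
        if label != blank and label != previous:
            decoded.append(label)
        previous = label
    return decoded


def labels_to_digits(labels: Sequence[int]) -> str:
    """Map class indices to a digit string (hypothetical mapping: k -> k - 1)."""
    return "".join(str(label - 1) for label in labels)


def one_hot(index: int, num_classes: int = 11) -> List[float]:
    """Toy stand-in for a frame of network scores."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


if __name__ == "__main__":
    # Frame-wise best path: blank, 3, 3, blank, 3, 8, 8, blank
    frames = [one_hot(i) for i in [0, 3, 3, 0, 3, 8, 8, 0]]
    labels = ctc_greedy_decode(frames)
    print(labels_to_digits(labels))  # prints "227"
```

Because each frame is decoded independently of its neighbours, this rule fits the observation that digits in a string carry no contextual information about one another.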
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhan, H., Chowdhury, P.N., Pal, U., Lu, Y. (2020). Handwritten Digit String Recognition for Indian Scripts. In: Palaiahnakote, S., Sanniti di Baja, G., Wang, L., Yan, W. (eds) Pattern Recognition. ACPR 2019. Lecture Notes in Computer Science, vol 12047. Springer, Cham. https://doi.org/10.1007/978-3-030-41299-9_21
DOI: https://doi.org/10.1007/978-3-030-41299-9_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-41298-2
Online ISBN: 978-3-030-41299-9
eBook Packages: Computer Science (R0)