Abstract
Handwritten text recognition remains a significant problem in optical character recognition. This paper proposes two novel feature extraction techniques based on a fixed-size sliding window, together with an edit distance-based architecture for recognizing off-line characters. The feature extraction techniques are designed for recognizing text in images. Because the approach is off-line, its input is image data taken from scanned documents or natural scenes. The experiments use the freely available Chars74k and MNIST datasets of English letters and digits. The proposed techniques extract features from off-line images of both characters and digits. Their impact on recognition accuracy is measured using several state-of-the-art machine learning algorithms, and these results are then compared experimentally with the proposed edit distance-based text recognition system. The proposed model reaches an accuracy of more than 96% on the MNIST dataset.
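The general idea of pairing sliding-window features with edit distance can be illustrated with a minimal sketch. The abstract does not specify the two feature extraction techniques, so the window encoding below (ink-pixel count per non-overlapping window) and the names `window_string`, `edit_distance`, and `classify` are assumptions made only for illustration, not the authors' method. Each image becomes a string of window symbols, and a query is assigned the label of the template whose string is closest in Levenshtein distance.

```python
from typing import List, Sequence, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def window_string(img: Image, win: int = 2) -> str:
    """Slide a non-overlapping win x win window over a binary image and
    encode each window by its ink-pixel count (a hypothetical feature)."""
    rows, cols = len(img), len(img[0])
    symbols = []
    for r in range(0, rows - win + 1, win):
        for c in range(0, cols - win + 1, win):
            ink = sum(img[r + i][c + j] for i in range(win) for j in range(win))
            symbols.append(chr(ord("a") + ink))  # 'a' means an empty window
    return "".join(symbols)


def edit_distance(s: str, t: str) -> int:
    """Levenshtein distance with unit insert, delete, and substitute costs."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            curr.append(min(prev[j] + 1,                # delete from s
                            curr[j - 1] + 1,            # insert into s
                            prev[j - 1] + (cs != ct)))  # substitute or match
        prev = curr
    return prev[-1]


def classify(query: Image, templates: Sequence[Tuple[str, Image]], win: int = 2) -> str:
    """Return the label of the template whose window string is nearest
    to the query's window string under edit distance."""
    q = window_string(query, win)
    return min(templates,
               key=lambda lt: edit_distance(q, window_string(lt[1], win)))[0]
```

In this sketch a query is compared against every labeled template, so the cost grows with the number of templates and the square of the string length; a real system would likely use a richer window feature and some form of pruning.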















Cite this article
Dey, R., Balabantaray, R.C. & Mohanty, S. Sliding window based off-line handwritten text recognition using edit distance. Multimed Tools Appl 81, 22761–22788 (2022). https://doi.org/10.1007/s11042-021-10988-9
DOI: https://doi.org/10.1007/s11042-021-10988-9