Sliding window based off-line handwritten text recognition using edit distance

  • 1169: Interdisciplinary Forensics: Government, Academia and Industry Interaction
Multimedia Tools and Applications

Abstract

Handwritten text recognition is a significant problem in optical character recognition. This paper proposes two novel feature extraction techniques based on a fixed-size sliding window, together with an edit distance-based architecture for recognizing off-line characters. The feature extraction techniques are designed for recognizing text from text images. Because the approach is off-line, its input is taken from scanned documents or natural scenes. The freely available Chars74k and MNIST datasets are used for English alphabets and digits. The proposed feature extraction techniques successfully generate features for off-line images of both characters and numbers. Their impact on text recognition accuracy is measured using several state-of-the-art machine learning algorithms, and these results are then compared, across several experiments, with the proposed edit distance-based text recognition system. The proposed model reaches an accuracy of more than 96% on the MNIST dataset.
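
As a rough illustration of how a fixed-size sliding window and edit distance can be combined, the following Python sketch encodes a binary character image as a string of window symbols and labels it by the nearest template under Levenshtein distance. The abstract does not describe the paper's two feature extraction techniques, so the ink-count feature, the non-overlapping window stride, and the names `window_codes`, `edit_distance`, and `classify` are assumptions made only for this example, not the authors' method.

```python
from typing import List, Sequence, Tuple

# Binary image as rows of pixels: 1 = ink, 0 = background (assumed format).
Image = Sequence[Sequence[int]]


def window_codes(image: Image, size: int = 2) -> str:
    """Slide a non-overlapping size x size window over the image and encode
    each window as one symbol derived from its ink-pixel count.
    This feature is hypothetical and chosen only for illustration."""
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            ink = sum(image[r + i][c + j]
                      for i in range(size) for j in range(size))
            codes.append(chr(ord("a") + ink))  # 0 ink -> 'a', 1 -> 'b', ...
    return "".join(codes)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def classify(image: Image, templates: List[Tuple[str, str]],
             size: int = 2) -> str:
    """Return the label of the template whose code string is closest to the
    image's code string (nearest neighbour under edit distance)."""
    query = window_codes(image, size)
    return min(templates, key=lambda t: edit_distance(query, t[1]))[0]
```

In this sketch, `templates` is a list of `(label, code_string)` pairs built by running `window_codes` on labelled training images. A practical system would likely need richer window features and a faster search than a linear scan over all templates.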

Author information

Corresponding author

Correspondence to Raghunath Dey.

About this article

Cite this article

Dey, R., Balabantaray, R.C. & Mohanty, S. Sliding window based off-line handwritten text recognition using edit distance. Multimed Tools Appl 81, 22761–22788 (2022). https://doi.org/10.1007/s11042-021-10988-9

  • DOI: https://doi.org/10.1007/s11042-021-10988-9
