Off-line Bangla handwritten word recognition: a holistic approach

  • Original Article
Neural Computing and Applications

Abstract

Owing to the cursive nature of the script, segmenting handwritten Bangla words into characters, and recognizing those characters, is often a very challenging problem. The presence of a comparatively large character set, along with modifiers, ascendants, descendants, and compound characters, makes the segmentation task more complex. Because a holistic method avoids such character-level segmentation, it is generally useful for recognizing words written in such complex scripts. In the present work, a holistic handwritten word recognition method is developed using a feature descriptor designed by combining different Elliptical, Tetragonal and Vertical pixel density histogram-based features. Recognition is carried out separately using two classifiers, namely multi-layer perceptron (MLP) and support vector machine (SVM). To evaluate the proposed method, a database of 18,000 handwritten Bangla word images covering 120 word classes is prepared. The proposed system performs comparatively better with SVM than with MLP on the prepared dataset. It achieves 83.64% accuracy in the best case and 79.38% accuracy on average using fivefold cross-validation. The method also outperforms some recently reported holistic word recognition techniques when tested on the developed dataset. In addition, the database prepared in this work is made freely available to address the absence of a publicly available standard database for holistic Bangla word recognition.
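The abstract reports both a best-case and an average accuracy under fivefold cross-validation. The following is a minimal sketch of how such figures could be derived from per-fold accuracies. It is not the authors' implementation: the function names, the toy data, and the nearest-centroid classifier (a simple stand-in for the SVM and MLP used in the paper) are all assumptions made for illustration.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def nearest_centroid_score(features, labels, train_idx, test_idx):
    """Hypothetical stand-in classifier (NOT the paper's SVM or MLP).

    Builds one mean vector per class from the training samples and
    returns the fraction of test samples assigned to the right class.
    """
    sums, counts = {}, {}
    for i in train_idx:
        total = sums.setdefault(labels[i], [0.0] * len(features[i]))
        for d, value in enumerate(features[i]):
            total[d] += value
        counts[labels[i]] = counts.get(labels[i], 0) + 1
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}

    def predict(x):
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    correct = sum(predict(features[i]) == labels[i] for i in test_idx)
    return correct / len(test_idx)


def cross_validate(features, labels, train_and_score, k=5, seed=0):
    """Return one accuracy per fold; each fold is the test set exactly once."""
    accuracies = []
    for test_idx in k_fold_indices(len(features), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(features)) if i not in held_out]
        accuracies.append(train_and_score(features, labels, train_idx, test_idx))
    return accuracies


# Toy data (an assumption): two well-separated classes of 2-D feature vectors.
toy_features = [[0.01 * i, 0.0] for i in range(10)] + [
    [5.0 + 0.01 * i, 5.0] for i in range(10)
]
toy_labels = [0] * 10 + [1] * 10

fold_accuracies = cross_validate(toy_features, toy_labels, nearest_centroid_score)
best_accuracy = max(fold_accuracies)      # the "best case" figure
average_accuracy = mean(fold_accuracies)  # the "on average" figure
```

In a real experiment, `train_and_score` would wrap whichever classifier is being evaluated, and the feature vectors would come from the word images rather than from toy data.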

Notes

  1. By foreground pixels in a word image, we mean object pixels only; all remaining pixels are treated as background. We follow this convention throughout the paper.
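To make the foreground convention concrete, here is a small sketch of a pixel-density histogram computed over vertical strips of a binarized word image. The strip layout, the function name, and the toy image are assumptions chosen for illustration; the paper's actual Elliptical, Tetragonal and Vertical feature definitions are not given in this excerpt and may differ.

```python
def vertical_pixel_density(image, n_strips):
    """Fraction of foreground pixels in each of n_strips vertical strips.

    `image` is a list of rows; 1 marks a foreground (object) pixel and
    0 marks a background pixel, following the convention in the note.
    """
    height, width = len(image), len(image[0])
    densities = []
    for s in range(n_strips):
        left = s * width // n_strips
        right = (s + 1) * width // n_strips
        area = height * (right - left)
        foreground = sum(row[x] for row in image for x in range(left, right))
        # Guard against empty strips when n_strips exceeds the image width.
        densities.append(foreground / area if area else 0.0)
    return densities


# Toy 3x4 binary image (an assumption, not data from the paper).
word_image = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
]
densities = vertical_pixel_density(word_image, 2)
```

Dividing by the strip area keeps each value in the range 0 to 1, so images of different sizes would yield comparable histograms.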

Author information

Corresponding author

Correspondence to Samir Malakar.

Ethics declarations

Conflict of interest

We declare that we do not have any conflict of interest.

About this article

Cite this article

Bhowmik, S., Malakar, S., Sarkar, R. et al. Off-line Bangla handwritten word recognition: a holistic approach. Neural Comput & Applic 31, 5783–5798 (2019). https://doi.org/10.1007/s00521-018-3389-1

  • DOI: https://doi.org/10.1007/s00521-018-3389-1
