Abstract
Because Bangla handwriting is cursive, segmenting handwritten Bangla words into characters and recognizing them is often very challenging. A comparatively large character set, together with modifiers, ascendants, descendants and compound characters, makes the segmentation task more complex. Because a holistic method avoids such character-level segmentation, it is generally useful for recognizing words written in complex scripts of this kind. In the present work, a holistic handwritten word recognition method is developed using a feature descriptor designed by combining elliptical, tetragonal and vertical pixel density histogram-based features. Recognition is carried out separately with two classifiers, namely a multi-layer perceptron (MLP) and a support vector machine (SVM). To evaluate the proposed method, a database of 18,000 handwritten Bangla word images covering 120 word classes is prepared. On the prepared dataset, the proposed system performs better with SVM than with MLP. It achieves 83.64% accuracy in the best case and 79.38% accuracy on average using fivefold cross-validation. The method also outperforms some recently reported holistic word recognition techniques tested on the developed dataset. In addition, the database prepared in this work is made freely available to address the absence of a publicly available standard database for holistic Bangla word recognition.

Notes
Throughout this paper, foreground pixels in a word image are the object pixels only; all remaining pixels are treated as background.
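As an illustration of this convention, the following minimal Python sketch marks the foreground pixels of a small grayscale image and computes the fraction of foreground pixels in equal-width vertical strips. It assumes dark strokes on a light background and a fixed threshold. The function names, the threshold value and the strip layout are hypothetical choices made for this example and are not taken from the feature descriptor used in this work.

```python
def foreground_mask(gray, threshold=128):
    """Return a 0/1 mask: 1 where a pixel is darker than `threshold`.

    Assumption: object (stroke) pixels are dark and the background is light.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def vertical_density_histogram(mask, bins):
    """Fraction of foreground pixels in each of `bins` equal-width vertical strips."""
    height, width = len(mask), len(mask[0])
    hist = []
    for b in range(bins):
        start = b * width // bins          # first column of the strip
        end = (b + 1) * width // bins      # one past the last column
        area = height * (end - start)
        count = sum(mask[r][c] for r in range(height) for c in range(start, end))
        hist.append(count / area if area else 0.0)
    return hist


# Tiny 4x6 grayscale "word image": dark strokes (0) on a light background (255).
gray = [
    [255,   0, 255, 255,   0,   0],
    [255,   0, 255, 255,   0, 255],
    [255,   0,   0, 255,   0, 255],
    [255, 255, 255, 255,   0,   0],
]
mask = foreground_mask(gray)
hist = vertical_density_histogram(mask, bins=3)
print(hist)  # [0.375, 0.125, 0.75]
```

Only pixels set to 1 in the mask contribute to the densities, so background pixels affect a strip's value solely through its area.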
Ethics declarations
Conflict of interest
We declare that we do not have any conflict of interest.
Cite this article
Bhowmik, S., Malakar, S., Sarkar, R. et al. Off-line Bangla handwritten word recognition: a holistic approach. Neural Comput & Applic 31, 5783–5798 (2019). https://doi.org/10.1007/s00521-018-3389-1
DOI: https://doi.org/10.1007/s00521-018-3389-1