Offline recognition of handwritten Bangla characters: an efficient two-stage approach

  • Industrial and Commercial Application
Pattern Analysis and Applications

Abstract

The present work deals with the recognition of handwritten characters of Bangla, a major script of the Indian subcontinent. The main contributions are (a) the generation of a database of handwritten basic characters of Bangla and (b) the development of a handwritten character recognition scheme, suitable for scripts such as Bangla that contain many similarly shaped characters, which provides benchmark results. The database is a pioneering development in the context of off-line handwritten character recognition for this script. It has 37,858 handwritten samples and accommodates a wide spectrum of handwriting styles of the Bangla-speaking population. The database will be made available (http://www.isical.ac.in/~ujjwal/download/Banglabasiccharacter.html) free of cost to researchers for further studies. We also identified two major factors that stand in the way of high recognition accuracy on the present character samples, namely (a) the erratic presence of the headline (Bangla characters usually contain a horizontal line in their upper part) and (b) the existence of several pairs of similarly shaped characters. The proposed recognition approach takes care of both factors. It identifies any confusion in the first-stage classification between a pair of similarly shaped character classes and resolves it in the second-stage classification by extracting a feature vector based on a non-uniform grid.
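
The two-stage idea described above can be illustrated with a minimal Python sketch. The abstract does not specify how confusion is detected or how the non-uniform grid is built, so everything concrete here is an assumption made for illustration: confusion is taken to mean that the two top-scoring classes form a known similar-shaped pair and lie within a score `margin`, and the grid lines are placed so that each strip holds roughly equal ink mass. The names `grid_cuts`, `non_uniform_grid_features`, `two_stage_classify`, `stage1`, `stage2` and `margin` are hypothetical and are not taken from the article.

```python
from bisect import bisect_right
from typing import Callable, Dict, FrozenSet, List, Sequence, Set

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def grid_cuts(profile: Sequence[int], parts: int) -> List[int]:
    """Return parts-1 cut positions splitting `profile` into strips of
    roughly equal ink mass (an assumed reading of "non-uniform grid")."""
    total = sum(profile)
    if total == 0:  # blank image: fall back to a uniform split
        return [len(profile) * k // parts for k in range(1, parts)]
    cuts: List[int] = []
    running = 0
    for i, value in enumerate(profile):
        running += value
        # Place the next cut once the next equal share of ink is reached.
        while len(cuts) < parts - 1 and running * parts >= (len(cuts) + 1) * total:
            cuts.append(i + 1)
    return cuts


def non_uniform_grid_features(img: Image, rows: int = 2, cols: int = 2) -> List[float]:
    """Fraction of ink falling into each cell of the ink-balanced grid."""
    row_profile = [sum(r) for r in img]
    col_profile = [sum(c) for c in zip(*img)]
    r_cuts = grid_cuts(row_profile, rows)
    c_cuts = grid_cuts(col_profile, cols)
    total = sum(row_profile) or 1  # avoid division by zero on blank images
    cells = [[0] * cols for _ in range(rows)]
    for y, row in enumerate(img):
        for x, pixel in enumerate(row):
            if pixel:
                cells[bisect_right(r_cuts, y)][bisect_right(c_cuts, x)] += 1
    return [count / total for line in cells for count in line]


def two_stage_classify(
    img: Image,
    stage1: Callable[[Image], Dict[str, float]],
    stage2: Dict[FrozenSet[str], Callable[[List[float]], str]],
    confusion_pairs: Set[FrozenSet[str]],
    margin: float = 0.1,
) -> str:
    """Accept the first-stage label unless the two best classes form a
    known confusion pair with a small score gap (assumed criterion)."""
    scores = stage1(img)
    ranked = sorted(scores, key=lambda label: scores[label], reverse=True)
    best, second = ranked[0], ranked[1]
    pair = frozenset((best, second))
    if pair in confusion_pairs and scores[best] - scores[second] < margin:
        # Second stage: a pair-specific classifier on grid features.
        return stage2[pair](non_uniform_grid_features(img))
    return best
```

In this sketch the second stage is a dictionary of pair-specific classifiers, so only the pairs known to be confusable incur the extra feature extraction; all other samples keep the first-stage decision.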

Acknowledgments

The authors would like to acknowledge the support of Bikash Shaw, Suman K. Ghosh and Saikat Das of the Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, towards the development of the database described in the present article.

Author information

Corresponding author

Correspondence to U. Bhattacharya.

About this article

Cite this article

Bhattacharya, U., Shridhar, M., Parui, S.K. et al. Offline recognition of handwritten Bangla characters: an efficient two-stage approach. Pattern Anal Applic 15, 445–458 (2012). https://doi.org/10.1007/s10044-012-0278-6

  • DOI: https://doi.org/10.1007/s10044-012-0278-6
