Direction histogram: novel discriminative global feature for Thai offline handwritten OCR

  • Short Paper
Pattern Analysis and Applications

Abstract

The image feature used for classification is a crucial part of a character recognition system. To achieve high accuracy in offline handwriting recognition, the feature should capture both the differences between characters and the variation among different drawings of the same character. In this paper, we present a novel image feature called direction histogram (DH) and a feature extraction algorithm called bag of histogram (BoH). Unlike traditional pre-defined features, DH was designed based on the nature of the language and the variation of writing styles. DH is, therefore, a global feature that represents pixel density in all directions around each center. BoH was introduced because it tolerates thickness and curve variation and ignores curve connectivity (if any). Fifty-two datasets, each containing 30 drawings of 80 Thai characters, are used for training our neural network, and the original, thick, and distorted handwriting datasets are used for testing. The recognition system with our proposed DH and BoH feature extraction algorithm yielded higher recognition accuracy than the convolutional neural network it was compared against.
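The abstract describes DH only as pixel density in all directions around a center; the formal definition is in the full text and is not reproduced here. As a rough illustration of that idea, the following minimal sketch counts ink pixels per angular sector around one chosen center. The function name `direction_histogram`, the eight sectors, the sector centering, the binary image layout, and the normalization are all assumptions made for this example, not the authors' algorithm.

```python
import math


def direction_histogram(image, center, n_bins=8):
    """Fraction of ink pixels in each angular sector around `center`.

    Illustrative sketch only (not the paper's definition).
    image:  list of rows, each a list of 0/1 values (1 = ink).
    center: (row, col) reference point chosen by the caller.
    """
    cy, cx = center
    sector = 2 * math.pi / n_bins
    counts = [0] * n_bins
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if not value or (y, x) == (cy, cx):
                continue  # skip background and the center pixel itself
            # Angle counter-clockwise from the +x axis; rows grow downward,
            # so the vertical offset is flipped.
            angle = math.atan2(cy - y, x - cx) % (2 * math.pi)
            # Sectors are centered on their nominal direction, so bin 0
            # covers angles within half a sector of "right".
            counts[round(angle / sector) % n_bins] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * n_bins
    # Normalize so the result describes relative density, not raw counts.
    return [c / total for c in counts]


# A horizontal stroke through the center of a 5x5 image.
horizontal = [[1 if r == 2 else 0 for _ in range(5)] for r in range(5)]
print(direction_histogram(horizontal, (2, 2)))
```

With eight sectors, bin 0 points right, bin 2 up, bin 4 left, and bin 6 down, so the horizontal stroke puts half of its ink in bin 0 and half in bin 4. Normalizing by the total ink count is one simple way to make the descriptor less sensitive to how many pixels a stroke contains; whether the paper normalizes this way is not stated in the abstract.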

Author information

Corresponding author

Correspondence to Ekawat Chaowicharat.

About this article

Cite this article

Chaowicharat, E., Naruedomkul, K. & Cercone, N. Direction histogram: novel discriminative global feature for Thai offline handwritten OCR. Pattern Anal Applic 19, 1069–1080 (2016). https://doi.org/10.1007/s10044-016-0536-0

  • DOI: https://doi.org/10.1007/s10044-016-0536-0
