Abstract
The image feature used for classification is a crucial part of a character recognition system. To achieve high accuracy in offline handwriting recognition, the feature should capture two kinds of difference: those between different characters and those between different drawings of the same character. In this paper, we present a novel image feature called direction histogram (DH) and a feature extraction algorithm called bag of histogram (BoH). Unlike traditional predefined features, DH is designed around the nature of the language and the variation in writing styles. DH is therefore a global feature that represents pixel density in all directions around each center. BoH was introduced because it tolerates variation in stroke thickness and curvature and ignores curve connectivity, if any. Fifty-two datasets, each containing 30 drawings of 80 Thai characters, are used to train our neural network, and the original, thick, and distorted handwriting datasets are used for testing. The recognition system with our proposed DH and BoH feature extraction algorithm yielded higher recognition accuracy than a convolutional neural network.
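To make the phrase "pixel density in all directions around each center" concrete, the following is a minimal sketch of one possible reading, not the authors' implementation. It assumes a binary image, equal-width angular sectors, and a caller-supplied list of centers; the function names `direction_histogram` and `dh_feature`, the parameter `n_bins`, and the normalization are all hypothetical choices made for illustration.

```python
import numpy as np

def direction_histogram(image, center, n_bins=8):
    """Fraction of foreground pixels in each angular sector around `center`.

    Assumption: `image` is a 2-D array whose nonzero entries are ink pixels,
    and `center` is a (row, col) pair. The paper may define DH differently.
    """
    rows, cols = np.nonzero(image)
    dy = rows - center[0]
    dx = cols - center[1]
    keep = (dy != 0) | (dx != 0)               # the center pixel has no direction
    angles = np.arctan2(dy[keep], dx[keep])    # values in [-pi, pi]
    # Map each angle to one of n_bins equal sectors; pi wraps back to bin 0.
    bins = np.floor((angles + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
    hist = np.bincount(bins, minlength=n_bins).astype(float)
    total = hist.sum()
    return hist / total if total > 0 else hist

def dh_feature(image, centers, n_bins=8):
    """Concatenate the per-center histograms into one global feature vector."""
    return np.concatenate([direction_histogram(image, c, n_bins) for c in centers])

# Usage: a short horizontal stroke to the right of the center of a 5x5 image.
img = np.zeros((5, 5), dtype=np.uint8)
img[2, 3] = 1
img[2, 4] = 1
print(direction_histogram(img, (2, 2)))
print(dh_feature(img, [(0, 0), (2, 2), (4, 4)]).shape)
```

Because each histogram is normalized by the number of ink pixels, a uniformly thicker stroke changes the counts but tends to leave the proportions similar, which is one plausible reason a direction-based density feature could be less sensitive to stroke thickness.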
Cite this article
Chaowicharat, E., Naruedomkul, K. & Cercone, N. Direction histogram: novel discriminative global feature for Thai offline handwritten OCR. Pattern Anal Applic 19, 1069–1080 (2016). https://doi.org/10.1007/s10044-016-0536-0