Abstract
Segmentation-based Optical Character Recognition (OCR) first segments a text line image into individual character images and then recognizes each character. Its accuracy therefore depends heavily on the segmentation step, and it suffers when characters touch or are broken. Unsegmented OCR, by contrast, processes the text line image without splitting it into individual characters, which makes it better suited to languages such as Thai, whose script naturally contains many touching characters. This paper applies Long Short-Term Memory (LSTM), an unsegmented method, to Thai OCR. It also introduces a method called vertical component shifting to address the large number of vertically stacked character combinations that arise in the four-level writing system of Thai and that pose difficulty for standard LSTM networks. The experimental results show that the proposed method achieves better accuracy than standard LSTM networks and than commercial software for Thai OCR.
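The stacking problem mentioned above can be illustrated at the text level with a minimal sketch. It does not reproduce the paper's vertical component shifting, which operates on images; it only shows how several Thai characters can occupy one horizontal position. The function name `vertical_stacks` is hypothetical, and the sketch assumes that every Thai mark written above or below a base character has the Unicode category `Mn` (nonspacing mark).

```python
import unicodedata


def vertical_stacks(text):
    """Group a string into stacks: a base character plus the
    nonspacing marks (Unicode category Mn) that follow it.

    Assumption: marks drawn above or below the base character are
    exactly the Mn characters; spacing vowels stay separate.
    """
    stacks = []
    for ch in text:
        if unicodedata.category(ch) == "Mn" and stacks:
            # The mark is drawn above or below the previous base character.
            stacks[-1] += ch
        else:
            stacks.append(ch)
    return stacks


# Two syllables, each a consonant with an upper vowel and a tone mark.
sample = "\u0e17\u0e35\u0e48\u0e19\u0e35\u0e49"
for stack in vertical_stacks(sample):
    # Each stack occupies one horizontal position in the text line.
    print(len(stack), [f"U+{ord(c):04X}" for c in stack])
```

If every distinct stack were treated as one output class, the number of classes would grow with the number of possible combinations, which is one way to read the difficulty the abstract attributes to standard LSTM networks.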
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Emsawas, T., Kijsirikul, B. (2016). Thai Printed Character Recognition Using Long Short-Term Memory and Vertical Component Shifting. In: Booth, R., Zhang, ML. (eds) PRICAI 2016: Trends in Artificial Intelligence. PRICAI 2016. Lecture Notes in Computer Science, vol 9810. Springer, Cham. https://doi.org/10.1007/978-3-319-42911-3_9
DOI: https://doi.org/10.1007/978-3-319-42911-3_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42910-6
Online ISBN: 978-3-319-42911-3
eBook Packages: Computer Science, Computer Science (R0)