Thai Scene Text Recognition with Character Combination

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2022)

Abstract

In recent years, scene text recognition (STR), which recognizes character sequences in natural images, has been in great demand across various fields. However, most STR studies focus only on popular scripts such as Chinese or English, and too little attention has been paid to minority languages. In this paper, we address problems in Thai STR and introduce a novel strategy called Thai Character Combination (TCC), which exploits the intrinsic characteristics of Thai text. Unlike most other scripts, characters in Thai text can be written both horizontally and vertically, which poses a major challenge to current sequence-based text recognition methods. To reduce structural complexity and alleviate the misalignment problem in attention-based methods, TCC combines Thai characters that stack vertically into independent combined characters. Furthermore, we establish a Thai Scene Text (TST) dataset, collected from multiple scenarios, to evaluate the performance of the proposed character modeling strategy. We conduct extensive experiments and analyses to compare the recognition performance of models with and without TCC. The results demonstrate the effectiveness of the proposed method from multiple perspectives. In particular, TCC is especially beneficial for long text recognition, and it substantially improves string-level recognition accuracy.
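
The idea of merging vertically stacked characters into combined units can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes that "stacked" characters can be approximated by Unicode nonspacing marks (general category `Mn`, which covers the Thai above-base and below-base vowels and tone marks), and the function name `combine_stacked` is hypothetical.

```python
import unicodedata


def combine_stacked(text: str) -> list[str]:
    """Group each base character with the nonspacing marks stacked on it.

    Assumption: a stacked Thai character is any code point whose Unicode
    general category is "Mn" (e.g. U+0E34 SARA I, U+0E48 MAI EK).
    The paper's actual combination rules may differ.
    """
    units: list[str] = []
    for ch in text:
        if units and unicodedata.category(ch) == "Mn":
            # Attach the mark to the preceding unit instead of
            # treating it as a separate step in the label sequence.
            units[-1] += ch
        else:
            units.append(ch)
    return units


# Three code points (consonant + upper vowel + tone mark) become one unit.
word = "\u0e17\u0e35\u0e48"
print(len(word), len(combine_stacked(word)))  # 3 1
```

Under this assumption the label sequence gets shorter and each unit corresponds to one horizontal position in the image, which is the property the abstract credits with easing alignment in attention-based recognizers.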

C. Li and H. Zhan contributed equally to this work.

Corresponding author

Correspondence to Yue Lu.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Li, C., Zhan, H., Zhao, K., Lu, Y. (2022). Thai Scene Text Recognition with Character Combination. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13536. Springer, Cham. https://doi.org/10.1007/978-3-031-18913-5_25

  • DOI: https://doi.org/10.1007/978-3-031-18913-5_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-18912-8

  • Online ISBN: 978-3-031-18913-5

  • eBook Packages: Computer Science, Computer Science (R0)
