Benchmarked multi-script Thai scene text dataset and its multi-class detection solution

Suwanwiwat, Hemmaphan; Das, Abhijit; Saqib, Muhammad; Pal, Umapada

doi:10.1007/s11042-020-10143-w

Published: 07 January 2021

Multimedia Tools and Applications, volume 80, pages 11843–11863 (2021)

Hemmaphan Suwanwiwat¹ (ORCID: orcid.org/0000-0001-6371-4084),
Abhijit Das²,³,
Muhammad Saqib⁴ &
Umapada Pal³


Abstract

Detecting text in scene images is a prevalent research topic. Text detection is considered challenging and non-interoperable because a single scene image may contain multiple scripts, each with different properties. It is therefore important to study scene text detection with respect to geographical location and the scripts found there. No work on large-scale multi-script Thai scene text detection was found in the literature, so this study focuses on multi-script text comprising Thai, English (Roman), Chinese or Chinese-like script, and Arabic, all of which are commonly seen around Thailand. Compared with the Roman/English script, Thai script has more consonants and vowels, and it has its own numerals. Furthermore, the placement of letters, tone (intonation) marks, and vowels differs from that of English or Chinese-like script. Detecting and recognising Thai text can therefore be considered challenging. This study proposes a multi-script dataset covering the aforementioned scripts and numerals, along with a benchmark using the Single Shot Multi-Box Detector (SSD) and Faster Regions with Convolutional Neural Networks (F-RCNN). The dataset contains scene images recorded in Thailand and consists of 600 images together with their manual detection annotations. The study also proposes a detection technique that formulates multi-script scene text detection as a multi-class detection problem, which was found to be more effective than legacy approaches. In experiments on the dataset, the proposed technique achieved encouraging precision and recall rates compared with such methods. The dataset is available upon email request to the corresponding authors.
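To make the multi-class framing concrete, the following is a minimal Python sketch of how precision and recall could be computed when each detected box carries a script label. It is not the authors' evaluation protocol, which the abstract does not specify. The label strings, the 0.5 IoU threshold, and the greedy one-to-one matching are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Detection = Tuple[str, Box]              # (script label, box); labels are hypothetical


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Detection],
    truths: List[Detection],
    iou_threshold: float = 0.5,  # assumed threshold, not taken from the paper
) -> Tuple[float, float]:
    """Class-aware precision and recall with greedy one-to-one matching.

    A prediction counts as a true positive only if it overlaps an unmatched
    ground-truth box of the SAME script label with IoU >= iou_threshold.
    """
    matched = set()
    true_positives = 0
    for label, box in preds:
        best_index, best_iou = -1, 0.0
        for j, (truth_label, truth_box) in enumerate(truths):
            if j in matched or truth_label != label:
                continue
            score = iou(box, truth_box)
            if score > best_iou:
                best_index, best_iou = j, score
        if best_index >= 0 and best_iou >= iou_threshold:
            matched.add(best_index)
            true_positives += 1
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(truths) if truths else 0.0
    return precision, recall
```

Under this sketch, a box placed correctly over Roman text but labelled as Chinese counts as a false positive. That is the practical difference between a multi-class formulation and a script-agnostic text detector.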

Author information

Authors and Affiliations

James Cook University, Cairns, Australia
Hemmaphan Suwanwiwat
Inria Sophia Antipolis, Nice, France
Abhijit Das
CVPR Unit, Indian Statistical Institute, Kolkata, India
Abhijit Das & Umapada Pal
University of Technology Sydney, Sydney, Australia
Muhammad Saqib

Corresponding authors

Correspondence to Hemmaphan Suwanwiwat or Abhijit Das.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Hemmaphan Suwanwiwat and Abhijit Das contributed equally.

About this article

Cite this article

Suwanwiwat, H., Das, A., Saqib, M. et al. Benchmarked multi-script Thai scene text dataset and its multi-class detection solution. Multimed Tools Appl 80, 11843–11863 (2021). https://doi.org/10.1007/s11042-020-10143-w

Received: 30 November 2019
Revised: 17 September 2020
Accepted: 23 October 2020
Published: 07 January 2021
Issue Date: March 2021
DOI: https://doi.org/10.1007/s11042-020-10143-w
