Abstract
This paper summarises the 1st Competition on Script Identification in the Wild (SIW 2021), organised in conjunction with the 16th International Conference on Document Analysis and Recognition (ICDAR 2021). The goal of SIW is to probe the limits of script identification approaches on a large-scale in-the-wild database covering 13 scripts (the MDIW-13 dataset) and two scenarios (handwritten and printed). The competition comprises three tasks that differ in the nature of the data used for training and testing. Nineteen research groups registered for SIW 2021, of which six teams from academia and industry took part in the final round and submitted a total of 166 algorithms for scoring. Submissions included a wide variety of deep-learning solutions as well as approaches based on standard image-processing techniques. The performance achieved by the participants demonstrates the superior accuracy of deep-learning methods over traditional statistical approaches: the best approach obtained classification accuracies of 99% on all three tasks, evaluated on more than 50K test samples. The results suggest that there is still room for improvement, especially on handwritten samples and for specific scripts.
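The report itself contains no code; as a rough, hedged illustration of the kind of deep-learning pipeline the abstract alludes to, the sketch below fine-tunes an ImageNet-pretrained CNN as a 13-way script classifier and computes the top-1 classification accuracy used for ranking. The choice of ResNet-50 and the names build_model and top1_accuracy are illustrative assumptions, not the participants' actual systems.

import torch
import torch.nn as nn
from torchvision import models

NUM_SCRIPTS = 13  # MDIW-13 covers 13 scripts

def build_model() -> nn.Module:
    # Assumed baseline: ImageNet-pretrained backbone with the classifier
    # head replaced by a 13-way linear layer.
    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
    model.fc = nn.Linear(model.fc.in_features, NUM_SCRIPTS)
    return model

@torch.no_grad()
def top1_accuracy(model: nn.Module, loader) -> float:
    # Competition-style metric: fraction of test samples whose predicted
    # script matches the ground-truth label.
    model.eval()
    correct, total = 0, 0
    for images, labels in loader:
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total

Under this reading, the three competition tasks would differ only in which subsets of MDIW-13 (handwritten, printed, or mixed) feed the training and test loaders.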
Notes
- 1.
- 2. 56 is 224/4, where 224 is the default input width of the CNN (see the sketch below).
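For context, the 224 → 56 arithmetic in this note matches the ×4 spatial downsampling of a typical CNN stem. The snippet below is a minimal sketch assuming a ResNet-style stem (stride-2 convolution followed by stride-2 pooling), not a detail taken from the paper.

import torch
import torch.nn as nn

# Stride-2 7x7 convolution halves 224 to 112; stride-2 pooling halves 112 to 56.
stem = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
)
x = torch.randn(1, 3, 224, 224)
print(stem(x).shape)  # torch.Size([1, 64, 56, 56])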
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Das, A., et al. (2021). ICDAR 2021 Competition on Script Identification in the Wild. In: Lladós, J., Lopresti, D., Uchida, S. (eds.) Document Analysis and Recognition – ICDAR 2021. Lecture Notes in Computer Science, vol. 12824. Springer, Cham. https://doi.org/10.1007/978-3-030-86337-1_49
Print ISBN: 978-3-030-86336-4
Online ISBN: 978-3-030-86337-1