Abstract
This paper summarises the 1st Competition on Script Identification in the Wild (SIW 2021), organised in conjunction with the 16th International Conference on Document Analysis and Recognition (ICDAR 2021). The goal of SIW is to probe the limits of script identification approaches on a large-scale in-the-wild database covering 13 scripts (the MDIW-13 dataset) and two scenarios (handwritten and printed). The competition comprises three tasks that differ in the nature of the data used for training and testing. Nineteen research groups registered for SIW 2021, of which six teams from academia and industry took part in the final round and submitted a total of 166 algorithms for scoring. Submissions included a wide variety of deep-learning solutions as well as approaches based on standard image-processing techniques. The performance achieved by the participants demonstrates the superior accuracy of deep-learning methods over traditional statistical approaches: the best approach obtained classification accuracies of 99% on all three tasks, evaluated on more than 50K test samples. The results suggest that there is still room for improvement, especially on handwritten samples and for specific scripts.
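The report itself contains no code; as a rough, hedged illustration of the kind of deep-learning pipeline the abstract alludes to, the sketch below fine-tunes an ImageNet-pretrained CNN as a 13-way script classifier and computes the top-1 classification accuracy used for ranking. The choice of ResNet-50 and the names build_model and top1_accuracy are illustrative assumptions, not the participants' actual systems.

import torch
import torch.nn as nn
from torchvision import models

NUM_SCRIPTS = 13  # MDIW-13 covers 13 scripts

def build_model() -> nn.Module:
    # Assumed baseline: ImageNet-pretrained backbone with the classifier
    # head replaced by a 13-way linear layer.
    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
    model.fc = nn.Linear(model.fc.in_features, NUM_SCRIPTS)
    return model

@torch.no_grad()
def top1_accuracy(model: nn.Module, loader) -> float:
    # Competition-style metric: fraction of test samples whose predicted
    # script matches the ground-truth label.
    model.eval()
    correct, total = 0, 0
    for images, labels in loader:
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total

Under this reading, the three competition tasks would differ only in which subsets of MDIW-13 (handwritten, printed, or mixed) feed the training and test loaders.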
Notes
- 1.
- 2. 56 is 224/4, where 224 is the default input width of the CNN (see the sketch below).
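For context, the 224 → 56 arithmetic in this note matches the ×4 spatial downsampling of a typical CNN stem. The snippet below is a minimal sketch assuming a ResNet-style stem (stride-2 convolution followed by stride-2 pooling), not a detail taken from the paper.

import torch
import torch.nn as nn

# Stride-2 7x7 convolution halves 224 to 112; stride-2 pooling halves 112 to 56.
stem = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
)
x = torch.randn(1, 3, 224, 224)
print(stem(x).shape)  # torch.Size([1, 64, 56, 56])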
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Das, A., et al. (2021). ICDAR 2021 Competition on Script Identification in the Wild. In: Lladós, J., Lopresti, D., Uchida, S. (eds.) Document Analysis and Recognition – ICDAR 2021. Lecture Notes in Computer Science, vol. 12824. Springer, Cham. https://doi.org/10.1007/978-3-030-86337-1_49
Print ISBN: 978-3-030-86336-4
Online ISBN: 978-3-030-86337-1