Annotation-Free Character Detection in Historical Vietnamese Stele Images

Scius-Bertrand, Anna; Jungo, Michael; Wolf, Beat; Fischer, Andreas; Bui, Marc

doi:10.1007/978-3-030-86549-8_28

Conference paper
First Online: 02 September 2021

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12821)

Abstract

Images of historical Vietnamese stone engravings provide historians with a unique opportunity to study the past of the country. However, because the thousands of images are highly heterogeneous in both the text foreground and the stone background, it is difficult to use automatic document analysis methods to support manual examination, especially given the labeling effort needed for training machine learning systems. In this paper, we present a method for finding the location of Chu Nom characters in the main text of the steles without the need for any human annotation. Using self-calibration, fully convolutional object detection methods trained on printed characters are successfully adapted to the handwritten image collection. The achieved detection results are promising for subsequent document analysis tasks, such as keyword spotting or transcription.

Notes

1. https://vietnamica.hypotheses.org.
2. Readers interested in the source code and the dataset are referred to our GitHub repository https://github.com/asciusb/annotationfree.
3. http://www.nomfoundation.org.
4. More specifically, three random selections have been performed. 11 samples have been selected among already transcribed steles, 22 samples from the dataset used in previous work [20], and 22 samples from the rest of the dataset.

References

1. Bochkovskiy, A., Wang, C.Y., Liao, H.Y.M.: Yolov4: Optimal speed and accuracy of object detection. arXiv:2004.10934 (2020)
2. Borges Oliveira, D.A., Viana, M.P.: Fast CNN-based document layout analysis. In: Proceedings International Conference on Computer Vision Workshops (ICCVW), pp. 1173–1180 (2017)
3. Clanuwat, T., Lamb, A., Kitamoto, A.: KuroNet: Pre-modern Japanese Kuzushiji character recognition with deep learning. In: Proceedings 15th International Conference on Document Analysis and Recognition (ICDAR), pp. 607–614 (2019)
4. Ester, M., Kriegel, H.P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings 2nd International Conference on Knowledge Discovery and Data Mining, pp. 226–231 (1996)
5. Farhadi, A., Redmon, J.: Yolov3: An incremental improvement. arXiv:1804.02767 (2018)
6. Fischer, A., Liwicki, M., Ingold, R. (eds.): Handwritten historical document analysis, recognition, and retrieval – State of the art and future trends. World Scientific (2020)
7. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016)
8. Jocher, G., et al.: ultralytics/yolov5: v4.0 - nn.SiLU() activations, Weights & Biases logging, PyTorch Hub integration (2021). https://doi.org/10.5281/ZENODO.4418161
9. Lin, T.Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2117–2125 (2017)
10. Lin, T.Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: Proceedings International Conference on Computer Vision (ICCV), pp. 2980–2988 (2017)
11. Lin, T.Y., et al.: Microsoft COCO: common objects in context. In: Proceedings 13th European Conference on Computer Vision (ECCV), pp. 740–755 (2014)
12. Liu, S., Qi, L., Qin, H., Shi, J., Jia, J.: Path aggregation network for instance segmentation. In: Proceedings International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8759–8768 (2018)
13. Nguyen, K.C., Nguyen, C.T., Nakagawa, M.: Nom document digitalization by deep convolution neural networks. Pattern Recogn. Lett. 133, 8–16 (2020)
14. Papin, P.: Aperçu sur le programme “Publication de l’inventaire et du corpus complet des inscriptions sur stèles du Viêt-Nam”. Bull. de l’École Française d’Extrême-Orient 90(1), 465–472 (2003)
15. Papin, P., Manh, T.K., Nguyên, N.V.: Corpus des inscriptions anciennes du Vietnam. EPHE, EFEO, Institut Han-Nôm (2005–2013)
16. Papin, P., Manh, T.K., Nguyên, N.V.: Catalogue des inscriptions du Viêt-Nam. EPHE, EFEO, Institut Han-Nôm (2007–2012)
17. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 779–788 (2016)
18. Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., Savarese, S.: Generalized intersection over union: a metric and a loss for bounding box regression. In: Proceedings of International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 658–666 (2019)
19. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Proceedings International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), pp. 234–241 (2015)
20. Scius-Bertrand, A., Voegtlin, L., Alberti, M., Fischer, A., Bui, M.: Layout analysis and text column segmentation for historical Vietnamese steles. In: Proceedings 5th International Workshop on Historical Document Imaging and Processing (HIP), pp. 84–89 (2019)
21. Stewart, S., Barrett, B.: Document image page segmentation and character recognition as semantic segmentation. In: Proceedings 4th International Workshop on Historical Document Imaging and Processing (HIP), pp. 101–106 (2017)
22. Sudholt, S., Fink, G.A.: PHOCNet: a deep convolutional neural network for word spotting in handwritten documents. In: Proceedings 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 277–282 (2016)
23. Tan, M., Le, Q.: EfficientNet: rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning (ICML), pp. 6105–6114 (2019)
24. Tian, Z., Shen, C., Chen, H., He, T.: FCOS: fully convolutional one-stage object detection. In: Proceedings of International Conference on Computer Vision (ICCV), pp. 9627–9636 (2019)
25. Wang, C.Y., Liao, H.Y.M., Wu, Y.H., Chen, P.Y., Hsieh, J.W., Yeh, I.H.: CSPNet: a new backbone that can enhance learning capability of CNN. In: Proceedings International Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 390–391 (2020)
26. Wu, Y., He, K.: Group normalization. In: Proceedings European Conference on Computer Vision (ECCV), pp. 3–19 (2018)
27. Yang, H., Jin, L., Huang, W., Yang, Z., Lai, S., Sun, J.: Dense and tight detection of Chinese characters in historical documents: datasets and a recognition guided detector. IEEE Access 6, 30174–30183 (2018)
28. Zhang, S., Chi, C., Yao, Y., Lei, Z., Li, S.Z.: Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection. In: Proceedings International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9759–9768 (2020)

Acknowledgements

This work has been supported by the Swiss Hasler Foundation (project 20008). It has also received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 833933 - VIETNAMICA).

We would like to thank Bélinda Hakkar, Marine Scius-Bertrand, Jean-Michel Nafziger, René Boutin, Morgane Vannier, Delphine Mamie and Tobias Widmer for spending more than a hundred hours annotating bounding boxes to create the ground truth of the test set.

Author information

Authors and Affiliations

iCoSys, University of Applied Sciences and Arts Western Switzerland, Sierre, Switzerland
Anna Scius-Bertrand, Michael Jungo, Beat Wolf & Andreas Fischer
Ecole Pratique des Hautes Etudes, PSL, Paris, France
Anna Scius-Bertrand & Marc Bui
DIVA, University of Fribourg, Fribourg, Switzerland
Andreas Fischer

Corresponding author

Correspondence to Anna Scius-Bertrand.

Editor information

Editors and Affiliations

Universitat Autònoma de Barcelona, Barcelona, Spain
Josep Lladós
Lehigh University, Bethlehem, PA, USA
Daniel Lopresti
Kyushu University, Fukuoka-shi, Japan
Seiichi Uchida

Cite this paper

Scius-Bertrand, A., Jungo, M., Wolf, B., Fischer, A., Bui, M. (2021). Annotation-Free Character Detection in Historical Vietnamese Stele Images. In: Lladós, J., Lopresti, D., Uchida, S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12821. Springer, Cham. https://doi.org/10.1007/978-3-030-86549-8_28

DOI: https://doi.org/10.1007/978-3-030-86549-8_28
Published: 02 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86548-1
Online ISBN: 978-3-030-86549-8
eBook Packages: Computer Science, Computer Science (R0)
