Abstract
Images of historical Vietnamese stone engravings provide historians with a unique opportunity to study the country's past. However, because the thousands of images are highly heterogeneous in both the text foreground and the stone background, it is difficult to use automatic document analysis methods to support manual examination, especially in view of the labeling effort needed to train machine learning systems. In this paper, we present a method for locating Chu Nom characters in the main text of the steles without the need for any human annotation. Using self-calibration, fully convolutional object detection methods trained on printed characters are successfully adapted to the handwritten image collection. The achieved detection results are promising for subsequent document analysis tasks, such as keyword spotting or transcription.
Notes
- 2. Readers interested in the source code and the dataset are referred to our GitHub repository https://github.com/asciusb/annotationfree.
- 4. More specifically, three random selections were performed: 11 samples were selected from already transcribed steles, 22 samples from the dataset used in previous work [20], and 22 samples from the rest of the dataset.
References
Bochkovskiy, A., Wang, C.Y., Liao, H.Y.M.: Yolov4: Optimal speed and accuracy of object detection. arXiv:2004.10934 (2020)
Borges Oliveira, D.A., Viana, M.P.: Fast CNN-based document layout analysis. In: Proceedings International Conference on Computer Vision Workshops (ICCVW), pp. 1173–1180 (2017)
Clanuwat, T., Lamb, A., Kitamoto, A.: KuroNet: Pre-modern Japanese Kuzushiji character recognition with deep learning. In: Proceedings 15th International Conference on Document Analysis and Recognition (ICDAR), pp. 607–614 (2019)
Ester, M., Kriegel, H.P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings 2nd International Conference on Knowledge Discovery and Data Mining, pp. 226–231 (1996)
Farhadi, A., Redmon, J.: Yolov3: An incremental improvement. arXiv:1804.02767 (2018)
Fischer, A., Liwicki, M., Ingold, R. (eds.): Handwritten historical document analysis, recognition, and retrieval – State of the art and future trends. World Scientific (2020)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016)
Jocher, G., et al.: ultralytics/yolov5: v4.0 - nn.SiLU() activations, Weights & Biases logging, PyTorch Hub integration (2021). https://doi.org/10.5281/ZENODO.4418161
Lin, T.Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2117–2125 (2017)
Lin, T.Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: Proceedings International Conference on Computer Vision (ICCV), pp. 2980–2988 (2017)
Lin, T.Y., et al.: Microsoft COCO: common objects in context. In: Proceedings 13th European Conference on Computer Vision (ECCV), pp. 740–755 (2014)
Liu, S., Qi, L., Qin, H., Shi, J., Jia, J.: Path aggregation network for instance segmentation. In: Proceedings International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8759–8768 (2018)
Nguyen, K.C., Nguyen, C.T., Nakagawa, M.: Nom document digitalization by deep convolution neural networks. Pattern Recogn. Lett. 133, 8–16 (2020)
Papin, P.: Aperçu sur le programme “Publication de l’inventaire et du corpus complet des inscriptions sur stèles du Viêt-Nam”. Bull. de l’École Française d’Extrême-Orient 90(1), 465–472 (2003)
Papin, P., Manh, T.K., Nguyên, N.V.: Corpus des inscriptions anciennes du Vietnam. EPHE, EFEO, Institut Han-Nôm (2005–2013)
Papin, P., Manh, T.K., Nguyên, N.V.: Catalogue des inscriptions du Viêt-Nam. EPHE, EFEO, Institut Han-Nôm (2007–2012)
Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 779–788 (2016)
Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., Savarese, S.: Generalized intersection over union: a metric and a loss for bounding box regression. In: Proceedings International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 658–666 (2019)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Proceedings International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), pp. 234–241 (2015)
Scius-Bertrand, A., Voegtlin, L., Alberti, M., Fischer, A., Bui, M.: Layout analysis and text column segmentation for historical Vietnamese steles. In: Proceedings 5th International Workshop on Historical Document Imaging and Processing (HIP), pp. 84–89 (2019)
Stewart, S., Barrett, B.: Document image page segmentation and character recognition as semantic segmentation. In: Proceedings 4th International Workshop on Historical Document Imaging and Processing (HIP), pp. 101–106 (2017)
Sudholt, S., Fink, G.A.: PHOCNet: a deep convolutional neural network for word spotting in handwritten documents. In: Proceedings 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 277–282 (2016)
Tan, M., Le, Q.: EfficientNet: rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning (ICML), pp. 6105–6114 (2019)
Tian, Z., Shen, C., Chen, H., He, T.: FCOS: fully convolutional one-stage object detection. In: Proceedings International Conference on Computer Vision (ICCV), pp. 9627–9636 (2019)
Wang, C.Y., Liao, H.Y.M., Wu, Y.H., Chen, P.Y., Hsieh, J.W., Yeh, I.H.: CSPNet: a new backbone that can enhance learning capability of CNN. In: Proceedings International Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 390–391 (2020)
Wu, Y., He, K.: Group normalization. In: Proceedings European Conference on Computer Vision (ECCV), pp. 3–19 (2018)
Yang, H., Jin, L., Huang, W., Yang, Z., Lai, S., Sun, J.: Dense and tight detection of Chinese characters in historical documents: datasets and a recognition guided detector. IEEE Access 6, 30174–30183 (2018)
Zhang, S., Chi, C., Yao, Y., Lei, Z., Li, S.Z.: Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection. In: Proceedings International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9759–9768 (2020)
Acknowledgements
This work has been supported by the Swiss Hasler Foundation (project 20008). It has also received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 833933 - VIETNAMICA).
We would like to thank Bélinda Hakkar, Marine Scius-Bertrand, Jean-Michel Nafziger, René Boutin, Morgane Vannier, Delphine Mamie and Tobias Widmer for spending more than a hundred hours annotating bounding boxes to create the ground truth of the test set.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Scius-Bertrand, A., Jungo, M., Wolf, B., Fischer, A., Bui, M. (2021). Annotation-Free Character Detection in Historical Vietnamese Stele Images. In: Lladós, J., Lopresti, D., Uchida, S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12821. Springer, Cham. https://doi.org/10.1007/978-3-030-86549-8_28
DOI: https://doi.org/10.1007/978-3-030-86549-8_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86548-1
Online ISBN: 978-3-030-86549-8
eBook Packages: Computer Science, Computer Science (R0)