Abstract
Stone monuments have historical value, and the inscriptions engraved on them can tell us about the events and people at the time of their installation. Photography is an easy way to record inscriptions; however, the lighting on the monument, the resulting shadows, and the innate texture of the stone can make the text in photographs unclear and difficult to recognize. This paper presents a method for inferring pixel-wise text areas in a stone monument image using a deep neural network that learns the shape of kanji characters. The network is trained on pseudo-inscription images, each generated by combining a shaded image that represents the engraved text with a stone texture image. In experiments with a High-Resolution Net (HRNet), we confirm that HRNet achieves high accuracy in inscription segmentation and that training with pseudo-inscription images is effective for detecting inscriptions on real stone monuments. Synthetic inscription images can therefore support efficient and accurate detection of text on stone monuments, contributing to further historical research.
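The synthesis step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the Lambertian shading model, the box-blur bevel, the toy cross-shaped stroke mask (standing in for a rendered kanji glyph), the noise texture, and all function and parameter names (`box_blur`, `synthesize_pseudo_inscription`, `light_dir`, `depth`, `bevel`) are assumptions chosen for illustration. The sketch carves a binary text mask into a height map, shades it under a directional light, and multiplies the result by a stone texture, returning the image together with its pixel-wise label.

```python
import numpy as np


def box_blur(img, radius):
    """Average each pixel over a (2*radius+1)^2 window, with edge padding."""
    padded = np.pad(img, radius, mode="edge")
    h, w = img.shape
    acc = np.zeros((h, w), dtype=np.float64)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / (2 * radius + 1) ** 2


def synthesize_pseudo_inscription(text_mask, texture,
                                  light_dir=(-1.0, -1.0, 1.0),
                                  depth=4.0, bevel=2):
    """Return (image, label) for one synthetic engraved-text sample.

    Assumed model (not taken from the paper): strokes are carved `depth`
    units below the surface with blurred walls, and brightness follows
    Lambertian shading under a single directional light.
    """
    mask = text_mask.astype(np.float64)
    height = -depth * box_blur(mask, bevel)      # carved strokes lie below the surface
    gy, gx = np.gradient(height)                 # slopes along rows (y) and columns (x)
    normals = np.stack([-gx, -gy, np.ones_like(height)], axis=-1)
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True)

    light = np.asarray(light_dir, dtype=np.float64)
    light /= np.linalg.norm(light)               # unit vector pointing toward the light
    shading = np.clip(normals @ light, 0.0, None) / light[2]  # flat stone -> 1.0

    image = np.clip(texture * shading, 0.0, 1.0)
    return image, text_mask.astype(np.uint8)


# Toy inputs: a cross-shaped "stroke" mask and a noisy grey stone texture.
rng = np.random.default_rng(0)
mask = np.zeros((64, 64), dtype=bool)
mask[28:36, 12:52] = True
mask[12:52, 28:36] = True
texture = np.clip(0.6 + 0.05 * rng.standard_normal((64, 64)), 0.0, 1.0)

image, label = synthesize_pseudo_inscription(mask, texture)
```

In this sketch, pairs of `image` and `label` would serve as training input and segmentation target. Varying `light_dir`, `depth`, and the texture gives different lighting and stone conditions for the same text mask.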
Acknowledgement
This work was supported by JSPS KAKENHI Grant Number JP20H01304.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Morita, N. et al. (2021). Inscription Segmentation Using Synthetic Inscription Images for Text Detection at Stone Monuments. In: Barney Smith, E.H., Pal, U. (eds) Document Analysis and Recognition – ICDAR 2021 Workshops. ICDAR 2021. Lecture Notes in Computer Science, vol 12916. Springer, Cham. https://doi.org/10.1007/978-3-030-86198-8_13
DOI: https://doi.org/10.1007/978-3-030-86198-8_13
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86197-1
Online ISBN: 978-3-030-86198-8
eBook Packages: Computer Science (R0)