Abstract
Image denoising is one of the most important steps in the document image analysis pipeline because of its positive effect on the rest of the workflow. However, the noise in historical documents differs markedly from the noise found in classical image processing problems. This is particularly true of images of Cham inscriptions obtained by stamping ancient steles. In this paper, we leverage deep learning to adapt to these noisy conditions. The proposed network follows an encoder-decoder structure that combines convolution/deconvolution operators with symmetrical skip connections and residual blocks to improve the reconstructed image. Furthermore, global attention fusion is proposed to learn the relevant regions in the image. Our experiments demonstrate that the proposed method can not only remove unwanted parts of the image but also enhance the visual quality of the Cham inscriptions.
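The data flow described in the abstract can be illustrated with a toy sketch on a 1-D signal. This is not the authors' network: fixed averaging and repetition stand in for learned convolution and deconvolution, and the attention gate (a sigmoid of each skip value minus the global mean) is only one assumed reading of "global attention fusion". All function names are hypothetical.

```python
import math


def downsample(x):
    """Encoder step: stride-2 averaging (stand-in for a learned strided convolution)."""
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x) - 1, 2)]


def upsample(x):
    """Decoder step: repeat each value twice (stand-in for a learned deconvolution)."""
    return [v for v in x for _ in range(2)]


def residual_block(x, weight=0.5):
    """y = x + f(x), with f a toy 3-tap smoothing correction scaled by `weight`."""
    n = len(x)
    smoothed = [(x[max(i - 1, 0)] + x[i] + x[min(i + 1, n - 1)]) / 3.0 for i in range(n)]
    return [xi + weight * (si - xi) for xi, si in zip(x, smoothed)]


def global_attention(skip):
    """Assumed gate: weight each position by how it compares with the global mean."""
    mean = sum(skip) / len(skip)
    return [1.0 / (1.0 + math.exp(-(v - mean))) for v in skip]


def fuse(decoded, skip):
    """Blend a skip connection into the decoder path using the attention weights."""
    weights = global_attention(skip)
    return [a * s + (1.0 - a) * d for a, s, d in zip(weights, skip, decoded)]


def forward(signal):
    """Two-level encoder-decoder with symmetric skips; len(signal) must be a multiple of 4."""
    e1 = downsample(signal)           # encoder level 1
    e2 = downsample(e1)               # encoder level 2 (bottleneck input)
    b = residual_block(e2)            # residual block at the bottleneck
    d1 = fuse(upsample(b), e1)        # decoder level 1 + skip from e1
    return fuse(upsample(d1), signal)  # decoder level 0 + skip from the input


if __name__ == "__main__":
    print(forward([0.0, 1.0, 0.0, 1.0, 5.0, 6.0, 5.0, 6.0]))
```

The symmetric pairing is the point of the sketch: each decoder level receives the encoder features of the same resolution, and the gate decides how much of that skip signal is mixed back in. In a trained model these operators and the gate would be learned rather than fixed.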
Acknowledgment
This work is supported by the French National Research Agency (ANR) in the framework of the ChAMDOC Project, n° ANR-19-CE27-0018-02.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Nguyen, TN., Burie, JC., Le, TL., Schweyer, AV. (2021). On the Use of Attention in Deep Learning Based Denoising Method for Ancient Cham Inscription Images. In: Lladós, J., Lopresti, D., Uchida, S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science(), vol 12821. Springer, Cham. https://doi.org/10.1007/978-3-030-86549-8_26
DOI: https://doi.org/10.1007/978-3-030-86549-8_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86548-1
Online ISBN: 978-3-030-86549-8
eBook Packages: Computer Science, Computer Science (R0)