Abstract
This paper presents a context-learning algorithm that inpaints missing image regions from visual features. The context encoder learns both the physical structure and the semantic information of the image, and this learned representation distinguishes it from a simple autoencoder. Such properties are crucial for tasks like image inpainting, classification and detection. Training uses a patch-wise reconstruction loss based on Structural Similarity (SSIM), applied jointly with an adversarial loss. The reconstruction loss is further augmented with spatially varying saliency maps that increase the error penalty on distinctive regions and thus promote image sharpness. To improve image continuity at the boundary of the missing region, distance-based weights that emphasize pixels near the border of the inpainting region are also used, either independently or in conjunction with the saliency maps. In short, giving more weight to pixels closer to the border of the missing region and more importance to salient parts of the image guides the reconstruction and produces more realistic images. We also show that this choice of reconstruction loss outperforms conventional criteria such as the L2 norm.
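The weighted reconstruction term described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the exponential `decay` weighting, the `(1 + beta * saliency)` combination, the Chebyshev distance, and the non-overlapping 4x4 patches are all assumptions made for illustration, and the adversarial term is omitted. Only the SSIM formula itself follows the standard definition for images scaled to [0, 1].

```python
import numpy as np

def distance_to_border(mask):
    """Chebyshev distance (in pixels) from each hole pixel to the nearest
    known pixel; 0 outside the hole. Pixels beyond the image edge are
    treated as known (an assumption of this sketch)."""
    h, w = mask.shape
    dist = np.zeros((h, w), dtype=np.int64)
    current = mask.astype(bool)
    while current.any():
        dist += current  # every pixel still in the hole is one ring deeper
        padded = np.pad(current, 1, constant_values=False)
        eroded = np.ones((h, w), dtype=bool)
        for dy in range(3):  # 3x3 erosion: keep pixels whose whole
            for dx in range(3):  # neighbourhood lies inside the hole
                eroded &= padded[dy:dy + h, dx:dx + w]
        current = eroded
    return dist

def border_weights(mask, decay=0.5, saliency=None, beta=1.0):
    """Weights that are largest on the hole border and decay toward its
    center, optionally boosted by a saliency map with values in [0, 1]."""
    dist = distance_to_border(mask)
    weights = np.where(mask, decay ** (dist - 1.0), 0.0)
    if saliency is not None:
        weights = weights * (1.0 + beta * saliency)  # hypothetical combination
    return weights

def patch_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM of two patches with intensities in [0, 1]."""
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

def weighted_ssim_loss(pred, target, weights, patch=4):
    """Weighted mean of (1 - SSIM) over non-overlapping patches."""
    h, w = target.shape
    num, den = 0.0, 0.0
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            sl = (slice(i, i + patch), slice(j, j + patch))
            wt = weights[sl].mean()
            if wt == 0:
                continue  # patch lies entirely outside the hole
            num += wt * (1.0 - patch_ssim(pred[sl], target[sl]))
            den += wt
    return float(num / den) if den > 0 else 0.0

# Toy example: a 16x16 random image with an 8x8 hole filled with zeros.
rng = np.random.default_rng(0)
target = rng.random((16, 16))
mask = np.zeros((16, 16), dtype=bool)
mask[4:12, 4:12] = True
pred = np.where(mask, 0.0, target)
weights = border_weights(mask, decay=0.5)
loss = weighted_ssim_loss(pred, target, weights)
print(loss)
```

In this sketch a perfect reconstruction gives a loss of zero, and errors in patches near the hole border contribute more than errors near its center. Passing a saliency map to `border_weights` raises the penalty on distinctive regions in the same way.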
Acknowledgment
This research has been co-financed by the European Regional Development Fund of the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH - CREATE - INNOVATE (project code: T1EDK-03832).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Stagakis, N., Zacharaki, E.I., Moustakas, K. (2019). Hierarchical Image Inpainting by a Deep Context Encoder Exploiting Structural Similarity and Saliency Criteria. In: Tzovaras, D., Giakoumis, D., Vincze, M., Argyros, A. (eds) Computer Vision Systems. ICVS 2019. Lecture Notes in Computer Science, vol 11754. Springer, Cham. https://doi.org/10.1007/978-3-030-34995-0_42
DOI: https://doi.org/10.1007/978-3-030-34995-0_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34994-3
Online ISBN: 978-3-030-34995-0
eBook Packages: Computer Science, Computer Science (R0)