Abstract
Image inpainting is the task of inferring missing pixels in an image from the known surrounding content. Existing methods address this task with convolutional neural networks and generative adversarial networks; however, these approaches produce blurry and unrealistic completions when the missing region is large and irregular, because the information available to fill the hole is insufficient and the networks fail to exploit the known pixels. This study therefore proposes a coarse-to-fine strategy built on a pre-trained partial-convolution-based encoder-decoder network. Our two-stage inpainting framework consists of a coarse network using partial convolution and a fine network using cross-attention layers based on the unsupervised cross-space translation generative adversarial network (UCTGAN). In the first stage, the coarse network completes the missing region to provide a clue for the second stage. In the second stage, the fine network projects instance images and the coarse clue image into a low-dimensional manifold space and combines their spatial features through cross semantic attention layers. Finally, the generator produces a refined image conditioned on the coarse completion. In the experiments, we evaluated the generated images qualitatively and quantitatively in terms of image quality and similarity. The results show that our framework completes large and irregular missing regions more precisely than previous methods such as those based on partial convolution or the original UCTGAN.
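The coarse stage relies on partial convolution (Liu et al., ECCV 2018), which restricts each convolution window to the known pixels and re-normalizes by the fraction of valid pixels, while the mask shrinks layer by layer. A minimal single-channel NumPy sketch of this operation (the function name and the 'valid'-padding choice are ours for illustration, not the paper's implementation):

```python
import numpy as np

def partial_conv2d(x, mask, weight, bias=0.0, k=3):
    """Single-channel partial convolution, minimal sketch.

    x      : (H, W) image
    mask   : (H, W) binary mask, 1 = known pixel, 0 = hole
    weight : (k, k) convolution kernel
    bias   : scalar bias
    Uses 'valid' padding for simplicity; returns (output, updated mask).
    """
    H, W = x.shape
    out = np.zeros((H - k + 1, W - k + 1))
    new_mask = np.zeros_like(out)
    win_size = k * k  # sum(1) term in the re-normalization
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            m = mask[i:i + k, j:j + k]
            valid = m.sum()
            if valid > 0:
                patch = x[i:i + k, j:j + k] * m  # zero out unknown pixels
                # scale by win_size / valid so partially masked windows
                # are weighted as if fully observed
                out[i, j] = (weight * patch).sum() * (win_size / valid) + bias
                new_mask[i, j] = 1.0  # the hole shrinks wherever any pixel was known
            # fully masked window: output stays 0 and the mask stays 0
    return out, new_mask
```

With an all-ones mask this reduces to a standard convolution; with holes, the renormalization keeps the response magnitude comparable, which is what lets the coarse network propagate plausible content into large irregular holes.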
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Wang, Y., Aizawa, H., Kurita, T. (2023). Image Inpainting for Large and Irregular Mask Based on Partial Convolution and Cross Semantic Attention. In: Lu, H., Blumenstein, M., Cho, SB., Liu, CL., Yagi, Y., Kamiya, T. (eds) Pattern Recognition. ACPR 2023. Lecture Notes in Computer Science, vol 14407. Springer, Cham. https://doi.org/10.1007/978-3-031-47637-2_9
Print ISBN: 978-3-031-47636-5
Online ISBN: 978-3-031-47637-2
eBook Packages: Computer Science, Computer Science (R0)