Abstract
Image inpainting aims to restore missing or damaged regions of an image with plausible visual content. Most existing methods face challenges when dealing with images containing large holes, such as structural distortion and prominent artifacts, which mainly stem from the absence of adequate semantic guidance. To overcome this problem, this paper proposes a novel progressive image inpainting network driven by dynamic context, where the dynamic semantic prior guiding the restoration not only includes information from the known region but also incorporates semantic inference performed during the filling process. Specifically, a multi-view cooperative strategy is first proposed to predict the inherent semantic information by estimating the distributions of the image, its grayscale map, and its edge map from the masked input. In addition, to cope with potential semantic changes, an auxiliary generative unit is proposed to learn the semantic information of the intermediate inpainting results, and the learned intermediate semantics are fused with the intrinsic semantics in a weighted manner to form a dynamic context that updates in real time. Moreover, the dynamic semantic prior is propagated to the various stages of inpainting to assist in constructing refined feature maps at multiple resolutions. Experiments on the CelebA-HQ and Paris StreetView datasets demonstrate that the proposed approach recovers reasonable structures and realistic textures on images with large-scale masks, achieving state-of-the-art performance.
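The weighted fusion of the intrinsic semantic prior with the semantics learned from intermediate results can be sketched as below. This is a minimal illustration, not the paper's exact formulation: the function name `fuse_dynamic_context`, the sigmoid gating, and the scalar blending parameter `alpha` are all assumptions introduced here for clarity.

```python
import numpy as np

def fuse_dynamic_context(intrinsic: np.ndarray,
                         intermediate: np.ndarray,
                         alpha: float) -> np.ndarray:
    """Blend the intrinsic semantic prior (estimated from the known
    region) with semantics extracted from an intermediate inpainting
    result, producing the dynamic context for the next stage.

    A sigmoid maps the (in practice, learnable) parameter `alpha`
    into (0, 1) so the two sources are combined convexly.
    """
    w = 1.0 / (1.0 + np.exp(-alpha))  # weight of the intrinsic prior
    return w * intrinsic + (1.0 - w) * intermediate

# Example: with alpha = 0 the two semantic maps contribute equally.
fused = fuse_dynamic_context(np.ones((4, 4)), np.zeros((4, 4)), 0.0)
```

In the actual network the weight would typically be a learned, spatially varying map rather than a single scalar, so that regions whose semantics change during filling can rely more heavily on the intermediate estimate.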
Data availability
Some or all data, models, or code generated or used during the study are available from the corresponding author by request.
Acknowledgements
This work is supported in part by the Natural Science Foundation of Hebei Province (No. F2022201009), the Hebei University High-level Scientific Research Foundation for the Introduction of Talent (No. 521100221029) and the Science and Technology Project of Hebei Education Department (No. QN2023186).
Author information
Contributions
ZW contributed to conceptualization, methodology, software, and writing—original draft. KL contributed to formal analysis, data curation, supervision, project administration, and funding acquisition. JP contributed to validation, investigation, resources, writing—review and editing.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Ethical and informed consent for data used
All data used in this study were obtained from publicly available datasets.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, Z., Li, K. & Peng, J. Dynamic context-driven progressive image inpainting with auxiliary generative units. Vis Comput 40, 3457–3472 (2024). https://doi.org/10.1007/s00371-023-03045-z