Abstract
Recently, deep models have established SOTA performance for low-resolution image inpainting, but they lack fidelity at resolutions typical of modern cameras (4K and above) and for large holes. We contribute an inpainting benchmark dataset of photos at 4K and above that is representative of modern sensors. We demonstrate a novel framework that combines deep learning and traditional methods. We use an existing deep inpainting model, LaMa [27], to fill the hole plausibly; establish three guide images consisting of structure, segmentation, and depth; and apply a multiply-guided PatchMatch [1] to produce eight candidate upsampled inpainted images. Next, we feed all candidate inpaintings through a novel curation module that chooses a good inpainting by column summation on an 8×8 antisymmetric pairwise preference matrix. Our framework's results are overwhelmingly preferred by users over 8 strong baselines, with quantitative metrics improving by up to 7.4× over the best baseline, LaMa. Moreover, when our technique is paired with 4 different SOTA inpainting backbones, it improves each, and users overwhelmingly prefer our results over a strong super-resolution baseline.
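To make the auto-curation step concrete, below is a minimal sketch of selection by column summation on an antisymmetric pairwise preference matrix. The sign convention (pref[i, j] > 0 meaning candidate j is preferred over candidate i, so that column j accumulates votes for candidate j) is our assumption; the matrix itself would come from the paper's learned pairwise preference model, which is not reproduced here.

```python
import numpy as np

def select_inpainting(pref: np.ndarray) -> int:
    """Select the winning candidate from an antisymmetric pairwise
    preference matrix (8x8 in the paper: one row/column per candidate).

    Assumed convention: pref[i, j] > 0 means candidate j is preferred
    over candidate i, so pref[j, i] = -pref[i, j] and the diagonal is 0.
    Column summation then aggregates each candidate's total preference,
    and the candidate whose column sum is largest is returned.
    """
    assert np.allclose(pref, -pref.T), "matrix must be antisymmetric"
    scores = pref.sum(axis=0)  # one aggregated score per candidate
    return int(np.argmax(scores))

# Toy 3-candidate example (the paper uses 8): candidate 2 dominates.
pref = np.array([[ 0.0,  0.3,  0.6],
                 [-0.3,  0.0,  0.2],
                 [-0.6, -0.2,  0.0]])
print(select_inpainting(pref))  # -> 2
```

By antisymmetry, picking the largest column sum is equivalent to picking the smallest row sum, so either axis works once the sign convention is fixed.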
References
Barnes, C., Shechtman, E., Finkelstein, A., Goldman, D.B.: PatchMatch: a randomized correspondence algorithm for structural image editing. ACM Trans. Graph. 28(3), 24 (2009)
Bénard, P., et al.: Stylizing animation by example. ACM Trans. Graph. (TOG) 32(4), 1–12 (2013)
Bosse, S., Maniry, D., Müller, K.R., Wiegand, T., Samek, W.: Deep neural networks for no-reference and full-reference image quality assessment. IEEE Trans. Image Process. 27(1), 206–219 (2017)
Burt, P.J., Adelson, E.H.: The Laplacian pyramid as a compact image code. In: Readings in Computer Vision, pp. 671–679. Elsevier (1987)
Cade, D.: The world’s first ‘fully’ digital camera was created by Fuji (2016). https://petapixel.com/2016/06/09/photo-history-worlds-first-fully-digital-camera-invented-fuji/
Darabi, S., Shechtman, E., Barnes, C., Goldman, D.B., Sen, P.: Image melding: combining inconsistent images using patch-based synthesis. ACM Trans. Graph. (TOG) 31(4), 1–10 (2012)
Diamanti, O., Barnes, C., Paris, S., Shechtman, E., Sorkine-Hornung, O.: Synthesis of complex image appearance from limited exemplars. ACM Trans. Graph. (TOG) 34(2), 1–14 (2015)
Duggal, S., Wang, S., Ma, W.C., Hu, R., Urtasun, R.: DeepPruner: learning efficient stereo matching via differentiable PatchMatch. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4384–4393 (2019)
Fišer, J., et al.: StyLit: illumination-guided example-based stylization of 3D renderings. ACM Trans. Graph. (TOG) 35(4), 1–11 (2016)
Gu, S., Lugmayr, A., Danelljan, M., Fritsche, M., Lamour, J., Timofte, R.: DIV8K: DIVerse 8K resolution image dataset. In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pp. 3512–3516. IEEE (2019)
He, K., Sun, J.: Statistics of patch offsets for image completion. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7573, pp. 16–29. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33709-3_2
Hertzmann, A., Jacobs, C.E., Oliver, N., Curless, B., Salesin, D.H.: Image analogies. In: Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, pp. 327–340 (2001)
Huang, J.B., Kang, S.B., Ahuja, N., Kopf, J.: Image completion using planar structure guidance. ACM Trans. Graph. (TOG) 33(4), 1–10 (2014)
Iizuka, S., Simo-Serra, E., Ishikawa, H.: Globally and locally consistent image completion. ACM Trans. Graph. (TOG) 36(4), 1–14 (2017)
Jamriška, O., et al.: Stylizing video by example. ACM Trans. Graph. (TOG) 38(4), 1–11 (2019)
Kaspar, A., Neubert, B., Lischinski, D., Pauly, M., Kopf, J.: Self tuning texture optimization. Comput. Graph. Forum 34, 349–359 (2015)
Li, Y., et al.: Fully convolutional networks for panoptic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 214–223 (2021)
Liao, J., Yao, Y., Yuan, L., Hua, G., Kang, S.B.: Visual attribute transfer through deep image analogy. ACM Trans. Graph. 36(4), 120:1–120:15 (2017). https://doi.org/10.1145/3072959.3073683
Liu, G., Reda, F.A., Shih, K.J., Wang, T.-C., Tao, A., Catanzaro, B.: Image inpainting for irregular holes using partial convolutions. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11215, pp. 89–105. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01252-6_6
Liu, H., Jiang, B., Song, Y., Huang, W., Yang, C.: Rethinking image inpainting via a mutual encoder-decoder with feature equalizations. arXiv preprint arXiv:2007.06929 (2020)
Liu, W., Zhang, P., Huang, X., Yang, J., Shen, C., Reid, I.: Real-time image smoothing via iterative least squares. ACM Trans. Graph. (TOG) 39(3), 1–24 (2020)
Nazeri, K., Ng, E., Joseph, T., Qureshi, F.Z., Ebrahimi, M.: EdgeConnect: generative image inpainting with adversarial edge learning. arXiv preprint arXiv:1901.00212 (2019)
Parmar, G., Zhang, R., Zhu, J.Y.: On buggy resizing libraries and surprising subtleties in FID calculation. arXiv preprint arXiv:2104.11222 (2021)
Pathak, D., Krahenbuhl, P., Donahue, J., Darrell, T., Efros, A.A.: Context encoders: feature learning by inpainting. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2536–2544 (2016)
Ranftl, R., Bochkovskiy, A., Koltun, V.: Vision transformers for dense prediction. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 12179–12188 (2021)
Ren, Y., Yu, X., Zhang, R., Li, T.H., Liu, S., Li, G.: StructureFlow: image inpainting via structure-aware appearance flow. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 181–190 (2019)
Suvorov, R., et al.: Resolution-robust large mask inpainting with Fourier convolutions. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2022)
Talebi, H., Milanfar, P.: NIMA: neural image assessment. IEEE Trans. Image Process. 27(8), 3998–4011 (2018)
Tan, M., Le, Q.: EfficientNet: rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning, pp. 6105–6114. PMLR (2019)
Wang, S.Y., Wang, O., Zhang, R., Owens, A., Efros, A.A.: CNN-generated images are surprisingly easy to spot... for now. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8695–8704 (2020)
Wang, X., Xie, L., Dong, C., Shan, Y.: Real-ESRGAN: training real-world blind super-resolution with pure synthetic data. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1905–1914 (2021)
Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004)
Wexler, Y., Shechtman, E., Irani, M.: Space-time completion of video. IEEE Trans. Pattern Anal. Mach. Intell. 29(3), 463–476 (2007)
Xiong, W., et al.: Foreground-aware image inpainting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5840–5848 (2019)
Xu, L., Yan, Q., Xia, Y., Jia, J.: Structure extraction from texture via relative total variation. ACM Trans. Graph. (TOG) 31(6), 1–10 (2012)
Yi, Z., Tang, Q., Azizi, S., Jang, D., Xu, Z.: Contextual residual aggregation for ultra high-resolution image inpainting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7508–7517 (2020)
Yin, W., et al.: Learning to recover 3D scene shape from a single image. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2021)
Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.S.: Generative image inpainting with contextual attention. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5505–5514 (2018)
Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.S.: Free-form image inpainting with gated convolution. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4471–4480 (2019)
Zeng, Y., Lin, Z., Yang, J., Zhang, J., Shechtman, E., Lu, H.: High-resolution image inpainting with iterative confidence feedback and guided upsampling. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12364, pp. 1–17. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58529-7_1
Zhang, H., et al.: ResNeSt: split-attention networks. arXiv preprint arXiv:2004.08955 (2020)
Zhang, L., et al.: Perceptual artifacts localization for inpainting. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) ECCV 2022. LNCS, vol. 13689, pp. 146–164. Springer, Cham (2022)
Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 586–595, June 2018
Zhao, S., et al.: Large scale image completion via co-modulated generative adversarial networks. In: International Conference on Learning Representations (ICLR) (2021)
Zhou, B., Lapedriza, A., Khosla, A., Oliva, A., Torralba, A.: Places: a 10 million image database for scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 40(6), 1452–1464 (2017)
Zhou, X., et al.: CoCosNet v2: full-resolution correspondence learning for image translation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11465–11475 (2021)
Zhou, X., et al.: Full-resolution correspondence learning for image translation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2021)
Zhu, H., Li, L., Wu, J., Dong, W., Shi, G.: MetaIQA: deep meta-learning for no-reference image quality assessment. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14143–14152 (2020)
Zhu, H., Li, L., Wu, J., Dong, W., Shi, G.: Generalizable no-reference image quality assessment via deep meta-learning. IEEE Trans. Circuits Syst. Video Technol. 32(3), 1048–1060 (2021)
Zhu, J.Y., Krahenbuhl, P., Shechtman, E., Efros, A.A.: Learning a discriminative model for the perception of realism in composite images. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3943–3951 (2015)
Zhu, M., et al.: Image inpainting by end-to-end cascaded refinement with mask awareness. IEEE Trans. Image Process. 30, 4855–4866 (2021)
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, L., et al. (2022). Inpainting at Modern Camera Resolution by Guided PatchMatch with Auto-curation. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) Computer Vision – ECCV 2022. LNCS, vol. 13677. Springer, Cham. https://doi.org/10.1007/978-3-031-19790-1_4
DOI: https://doi.org/10.1007/978-3-031-19790-1_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19789-5
Online ISBN: 978-3-031-19790-1
eBook Packages: Computer Science (R0)