
Inpainting at Modern Camera Resolution by Guided PatchMatch with Auto-curation

  • Conference paper in Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13677)

Abstract

Recently, deep models have established SOTA performance for low-resolution image inpainting, but they lack fidelity at the resolutions of modern cameras, such as 4K and above, and for large holes. We contribute an inpainting benchmark dataset of photos at 4K and above, representative of modern sensors. We demonstrate a novel framework that combines deep learning and traditional methods: we use an existing deep inpainting model, LaMa [27], to fill the hole plausibly; establish three guide images (structure, segmentation, and depth); and apply a multiply-guided PatchMatch [1] to produce eight candidate upsampled inpainted images. Next, we feed all candidate inpaintings through a novel curation module that chooses a good inpainting by column summation on an 8 × 8 antisymmetric pairwise preference matrix. Our framework's results are overwhelmingly preferred by users over 8 strong baselines, with quantitative metrics improving by up to 7.4× over the best baseline, LaMa; and when paired with 4 different SOTA inpainting backbones, our technique improves each, such that ours is overwhelmingly preferred by users over a strong super-resolution baseline.
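The curation step is concrete enough to sketch. Below is a minimal NumPy illustration of two pieces the abstract names: a multiply-guided patch cost of the kind guided PatchMatch variants typically minimize, and selection among the eight candidates by column summation on an antisymmetric pairwise preference matrix. The weighted-SSD cost form, the sign convention (positive `pref[i, j]` meaning candidate i beats candidate j), and the helper names are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def guided_patch_cost(src, tgt, guides_src, guides_tgt, weights):
    """Weighted sum-of-squared-differences patch cost with guide channels.

    src, tgt: RGB patches of identical shape; guides_src, guides_tgt:
    matching lists of guide patches (e.g. structure, segmentation, depth);
    weights: one scalar per guide. A weighted-SSD cost like this is a
    common choice for guided PatchMatch variants; the paper's exact cost
    may differ.
    """
    cost = np.sum((src.astype(np.float64) - tgt.astype(np.float64)) ** 2)
    for gs, gt, w in zip(guides_src, guides_tgt, weights):
        cost += w * np.sum((gs.astype(np.float64) - gt.astype(np.float64)) ** 2)
    return cost

def select_by_column_sum(pref):
    """Pick a winner from an antisymmetric pairwise preference matrix.

    Assumed convention: pref[i, j] > 0 means candidate i is preferred over
    candidate j, so antisymmetry gives pref[j, i] = -pref[i, j] and a zero
    diagonal. Column j then accumulates the preference *against* candidate
    j, so the winner minimizes its column sum (equivalently, maximizes its
    row sum).
    """
    assert np.allclose(pref, -pref.T), "matrix must be antisymmetric"
    return int(np.argmin(pref.sum(axis=0)))

# Toy 8-candidate example mirroring the paper's 8x8 setting: score only
# the i < j pairs (antisymmetry supplies the rest), then select.
rng = np.random.default_rng(0)
upper = np.triu(rng.normal(size=(8, 8)), k=1)
pref = upper - upper.T
best = select_by_column_sum(pref)
print(f"selected candidate: {best}")
```

One practical upshot of antisymmetry is that only the 28 ordered pairs with i < j need to be scored; the other half of the matrix and the diagonal follow for free.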

Notes

  1. With the exception of HiFill [36] and LaMa [27], which we discuss in related work.

References

  1. Barnes, C., Shechtman, E., Finkelstein, A., Goldman, D.B.: PatchMatch: a randomized correspondence algorithm for structural image editing. ACM Trans. Graph. 28(3), 24 (2009)

  2. Bénard, P., et al.: Stylizing animation by example. ACM Trans. Graph. (TOG) 32(4), 1–12 (2013)

  3. Bosse, S., Maniry, D., Müller, K.R., Wiegand, T., Samek, W.: Deep neural networks for no-reference and full-reference image quality assessment. IEEE Trans. Image Process. 27(1), 206–219 (2017)

  4. Burt, P.J., Adelson, E.H.: The Laplacian pyramid as a compact image code. In: Readings in Computer Vision, pp. 671–679. Elsevier (1987)

  5. Cade, D.: The world’s first ‘fully’ digital camera was created by Fuji (2016). https://petapixel.com/2016/06/09/photo-history-worlds-first-fully-digital-camera-invented-fuji/

  6. Darabi, S., Shechtman, E., Barnes, C., Goldman, D.B., Sen, P.: Image melding: combining inconsistent images using patch-based synthesis. ACM Trans. Graph. (TOG) 31(4), 1–10 (2012)

  7. Diamanti, O., Barnes, C., Paris, S., Shechtman, E., Sorkine-Hornung, O.: Synthesis of complex image appearance from limited exemplars. ACM Trans. Graph. (TOG) 34(2), 1–14 (2015)

  8. Duggal, S., Wang, S., Ma, W.C., Hu, R., Urtasun, R.: DeepPruner: learning efficient stereo matching via differentiable PatchMatch. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4384–4393 (2019)

  9. Fišer, J., et al.: StyLit: illumination-guided example-based stylization of 3D renderings. ACM Trans. Graph. (TOG) 35(4), 1–11 (2016)

  10. Gu, S., Lugmayr, A., Danelljan, M., Fritsche, M., Lamour, J., Timofte, R.: DIV8K: DIVerse 8K resolution image dataset. In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pp. 3512–3516. IEEE (2019)

  11. He, K., Sun, J.: Statistics of patch offsets for image completion. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7573, pp. 16–29. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33709-3_2

  12. Hertzmann, A., Jacobs, C.E., Oliver, N., Curless, B., Salesin, D.H.: Image analogies. In: Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, pp. 327–340 (2001)

  13. Huang, J.B., Kang, S.B., Ahuja, N., Kopf, J.: Image completion using planar structure guidance. ACM Trans. Graph. (TOG) 33(4), 1–10 (2014)

  14. Iizuka, S., Simo-Serra, E., Ishikawa, H.: Globally and locally consistent image completion. ACM Trans. Graph. (TOG) 36(4), 1–14 (2017)

  15. Jamriška, O., et al.: Stylizing video by example. ACM Trans. Graph. (TOG) 38(4), 1–11 (2019)

  16. Kaspar, A., Neubert, B., Lischinski, D., Pauly, M., Kopf, J.: Self tuning texture optimization. Comput. Graph. Forum 34, 349–359 (2015)

  17. Li, Y., et al.: Fully convolutional networks for panoptic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 214–223 (2021)

  18. Liao, J., Yao, Y., Yuan, L., Hua, G., Kang, S.B.: Visual attribute transfer through deep image analogy. ACM Trans. Graph. 36(4), 120:1–120:15 (2017). https://doi.org/10.1145/3072959.3073683

  19. Liu, G., Reda, F.A., Shih, K.J., Wang, T.-C., Tao, A., Catanzaro, B.: Image inpainting for irregular holes using partial convolutions. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11215, pp. 89–105. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01252-6_6

  20. Liu, H., Jiang, B., Song, Y., Huang, W., Yang, C.: Rethinking image inpainting via a mutual encoder-decoder with feature equalizations. arXiv preprint arXiv:2007.06929 (2020)

  21. Liu, W., Zhang, P., Huang, X., Yang, J., Shen, C., Reid, I.: Real-time image smoothing via iterative least squares. ACM Trans. Graph. (TOG) 39(3), 1–24 (2020)

  22. Nazeri, K., Ng, E., Joseph, T., Qureshi, F.Z., Ebrahimi, M.: EdgeConnect: generative image inpainting with adversarial edge learning. arXiv preprint arXiv:1901.00212 (2019)

  23. Parmar, G., Zhang, R., Zhu, J.Y.: On buggy resizing libraries and surprising subtleties in FID calculation. arXiv preprint arXiv:2104.11222 (2021)

  24. Pathak, D., Krahenbuhl, P., Donahue, J., Darrell, T., Efros, A.A.: Context encoders: feature learning by inpainting. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2536–2544 (2016)

  25. Ranftl, R., Bochkovskiy, A., Koltun, V.: Vision transformers for dense prediction. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 12179–12188 (2021)

  26. Ren, Y., Yu, X., Zhang, R., Li, T.H., Liu, S., Li, G.: StructureFlow: image inpainting via structure-aware appearance flow. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 181–190 (2019)

  27. Suvorov, R., et al.: Resolution-robust large mask inpainting with Fourier convolutions. In: WACV: Winter Conference on Applications of Computer Vision (2022)

  28. Talebi, H., Milanfar, P.: NIMA: neural image assessment. IEEE Trans. Image Process. 27(8), 3998–4011 (2018)

  29. Tan, M., Le, Q.: EfficientNet: rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning, pp. 6105–6114. PMLR (2019)

  30. Wang, S.Y., Wang, O., Zhang, R., Owens, A., Efros, A.A.: CNN-generated images are surprisingly easy to spot... for now. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8695–8704 (2020)

  31. Wang, X., Xie, L., Dong, C., Shan, Y.: Real-ESRGAN: training real-world blind super-resolution with pure synthetic data. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1905–1914 (2021)

  32. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004)

  33. Wexler, Y., Shechtman, E., Irani, M.: Space-time completion of video. IEEE Trans. Pattern Anal. Mach. Intell. 29(3), 463–476 (2007)

  34. Xiong, W., et al.: Foreground-aware image inpainting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5840–5848 (2019)

  35. Xu, L., Yan, Q., Xia, Y., Jia, J.: Structure extraction from texture via relative total variation. ACM Trans. Graph. (TOG) 31(6), 1–10 (2012)

  36. Yi, Z., Tang, Q., Azizi, S., Jang, D., Xu, Z.: Contextual residual aggregation for ultra high-resolution image inpainting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7508–7517 (2020)

  37. Yin, W., et al.: Learning to recover 3D scene shape from a single image. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2021)

  38. Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.S.: Generative image inpainting with contextual attention. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5505–5514 (2018)

  39. Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.S.: Free-form image inpainting with gated convolution. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4471–4480 (2019)

  40. Zeng, Yu., Lin, Z., Yang, J., Zhang, J., Shechtman, E., Lu, H.: High-resolution image inpainting with iterative confidence feedback and guided upsampling. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12364, pp. 1–17. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58529-7_1

  41. Zhang, H., et al.: ResNeSt: split-attention networks. arXiv preprint arXiv:2004.08955 (2020)

  42. Zhang, L., et al.: Perceptual artifacts localization for inpainting. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) ECCV 2022. LNCS, vol. 13689, pp. 146–164. Springer, Cham (2022)

  43. Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 586–595, June 2018

  44. Zhao, S., et al.: Large scale image completion via co-modulated generative adversarial networks. In: International Conference on Learning Representations (ICLR) (2021)

  45. Zhou, B., Lapedriza, A., Khosla, A., Oliva, A., Torralba, A.: Places: a 10 million image database for scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 40(6), 1452–1464 (2017)

  46. Zhou, X., et al.: CoCosNet v2: full-resolution correspondence learning for image translation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11465–11475 (2021)

  47. Zhou, X., et al.: Full-resolution correspondence learning for image translation. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2021)

  48. Zhu, H., Li, L., Wu, J., Dong, W., Shi, G.: MetaIQA: deep meta-learning for no-reference image quality assessment. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14143–14152 (2020)

  49. Zhu, H., Li, L., Wu, J., Dong, W., Shi, G.: Generalizable no-reference image quality assessment via deep meta-learning. IEEE Trans. Circuits Syst. Video Technol. 32(3), 1048–1060 (2021)

  50. Zhu, J.Y., Krahenbuhl, P., Shechtman, E., Efros, A.A.: Learning a discriminative model for the perception of realism in composite images. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3943–3951 (2015)

  51. Zhu, M., et al.: Image inpainting by end-to-end cascaded refinement with mask awareness. IEEE Trans. Image Process. 30, 4855–4866 (2021)

Author information

Correspondence to Lingzhi Zhang.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 7385 KB)

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, L. et al. (2022). Inpainting at Modern Camera Resolution by Guided PatchMatch with Auto-curation. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13677. Springer, Cham. https://doi.org/10.1007/978-3-031-19790-1_4

  • DOI: https://doi.org/10.1007/978-3-031-19790-1_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19789-5

  • Online ISBN: 978-3-031-19790-1

  • eBook Packages: Computer Science, Computer Science (R0)
