Conditional Image Repainting via Semantic Bridge and Piecewise Value Function

Weng, Shuchen; Li, Wenbo; Li, Dawei; Jin, Hongxia; Shi, Boxin

doi:10.1007/978-3-030-58545-7_27

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12354)

Included in the following conference series:

European Conference on Computer Vision

Abstract

We study conditional image repainting, where a model is trained to generate visual content conditioned on user inputs and to composite that content seamlessly onto a user-provided image while preserving the semantics of the user's inputs. The content generation community has long sought to lower the skill barrier for users. Human language is the rose among thorns for this purpose: it is friendly to users, but it makes it very difficult for the model to associate relevant words with semantically ambiguous regions. To resolve this issue, we propose a delicate mechanism that bridges the semantic chasm between the language input and the generated visual content. State-of-the-art image compositing techniques impose a latent ceiling on the fidelity of the composited content during adversarial training. In this work, we improve compositing by breaking through this latent ceiling with a novel piecewise value function. We demonstrate on two datasets that the proposed techniques handle conditional image repainting better than existing ones.
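
To make the compositing step mentioned in the abstract concrete, the following is a minimal sketch of generic mask-based blending, assuming a soft mask with values in [0, 1]. It is not the paper's method: the abstract does not specify the compositing formula or the piecewise value function, and the names `composite`, `generated`, `background`, and `mask` are hypothetical, chosen only for illustration.

```python
from typing import List

# Assumption: a grayscale image is a list of rows of intensities in [0, 1].
Image = List[List[float]]


def composite(generated: Image, background: Image, mask: Image) -> Image:
    """Blend `generated` onto `background` with a soft mask in [0, 1].

    Generic alpha blending, shown for illustration only:
    out = mask * generated + (1 - mask) * background
    """
    if not (len(generated) == len(background) == len(mask)):
        raise ValueError("all inputs must have the same height")
    out: Image = []
    for g_row, b_row, m_row in zip(generated, background, mask):
        if not (len(g_row) == len(b_row) == len(m_row)):
            raise ValueError("all inputs must have the same width")
        # Per-pixel convex combination of generated and background values.
        out.append([m * g + (1.0 - m) * b for g, b, m in zip(g_row, b_row, m_row)])
    return out


# Toy 2x2 example: white generated content over a black background.
generated = [[1.0, 1.0], [1.0, 1.0]]
background = [[0.0, 0.0], [0.0, 0.0]]
mask = [[1.0, 0.5], [0.0, 0.0]]
result = composite(generated, background, mask)
```

Where the mask is 1 the generated pixel is used, where it is 0 the background pixel is kept, and intermediate values mix the two, which is what lets a composite avoid hard seams at region boundaries.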

Acknowledgements

PKU affiliated authors are supported by National Natural Science Foundation of China under Grant No. 61872012, National Key R&D Program of China (2019YFF0302902), and Beijing Academy of Artificial Intelligence (BAAI).

Author information

Authors and Affiliations

NELVT, Department of Computer Science and Technology, Peking University, Beijing, China
Shuchen Weng & Boxin Shi
Samsung Research America AI Center, Mountain View, CA, USA
Wenbo Li, Dawei Li & Hongxia Jin
Institute for Artificial Intelligence, Peking University, Beijing, China
Boxin Shi

Corresponding author

Correspondence to Boxin Shi.

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

About this paper

Cite this paper

Weng, S., Li, W., Li, D., Jin, H., Shi, B. (2020). Conditional Image Repainting via Semantic Bridge and Piecewise Value Function. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12354. Springer, Cham. https://doi.org/10.1007/978-3-030-58545-7_27

DOI: https://doi.org/10.1007/978-3-030-58545-7_27
Published: 05 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58544-0
Online ISBN: 978-3-030-58545-7
eBook Packages: Computer Science, Computer Science (R0)
