Semantic Inpainting with Multi-dimensional Adversarial Network and Wasserstein Distance

Wang, Haodi; Jiao, Libin; Bie, Rongfang; Wu, Hao

doi:10.1007/978-3-030-60636-7_7

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12307)

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

Abstract

Inpainting is the process of restoring the lost parts of an image from the information that remains. We present an inpainting network that consists of an Encoder-Decoder pipeline and a multi-dimensional adversarial network. The Encoder-Decoder pipeline extracts and learns features from an input image with a missing area. Through unsupervised learning, the pipeline predicts the missing region and fills it with the most plausible content. Meanwhile, the multi-dimensional adversarial network identifies differences between the ground truth and the generated images, both in detail and in general. Unlike the traditional training procedure, our model incorporates the Wasserstein distance, which enhances the stability of network training. The network is trained specifically on street view images; it not only produces satisfying results but is also competitive with existing methods.
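
The abstract does not give the loss functions, so the following is only a minimal sketch of how a Wasserstein-style objective might be combined with a reconstruction term for inpainting. It assumes that "in detail and in general" corresponds to two critics, one scoring the filled region and one scoring the whole image. The function names, the masked mean-squared-error term, and the weight `adv_weight` are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def wasserstein_critic_loss(real_scores, fake_scores):
    """Critic objective in the Wasserstein GAN formulation.

    Minimizing this pushes critic scores up on real images and down on
    generated ones, so the negated loss estimates the distance.
    """
    return float(np.mean(fake_scores) - np.mean(real_scores))


def wasserstein_generator_loss(fake_scores):
    """Generator objective: raise the critic's score on generated images."""
    return float(-np.mean(fake_scores))


def masked_reconstruction_loss(truth, output, mask):
    """Mean squared error over the missing region only (mask == 1).

    Assumption: the paper's reconstruction term is not stated in the abstract.
    """
    diff = (truth - output) * mask
    return float(np.sum(diff ** 2) / max(np.sum(mask), 1.0))


def combined_generator_loss(truth, output, mask,
                            local_scores, global_scores, adv_weight=0.001):
    """Reconstruction term plus adversarial terms from two critics.

    Assumption: `local_scores` come from a critic that sees the filled
    region, `global_scores` from a critic that sees the whole image.
    `adv_weight` is a hypothetical balancing factor.
    """
    adversarial = (wasserstein_generator_loss(local_scores)
                   + wasserstein_generator_loss(global_scores))
    return masked_reconstruction_loss(truth, output, mask) + adv_weight * adversarial


if __name__ == "__main__":
    truth = np.ones((2, 2))
    output = np.zeros((2, 2))
    mask = np.array([[1.0, 0.0], [0.0, 0.0]])  # one missing pixel
    print(wasserstein_critic_loss(np.array([1.0, 3.0]), np.array([0.0, 1.0])))
    print(combined_generator_loss(truth, output, mask,
                                  np.array([0.0, 1.0]), np.array([2.0, 2.0])))
```

In practice the critic must also be kept approximately 1-Lipschitz, for example by weight clipping or a gradient penalty; this sketch omits that step.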

This research is sponsored by the National Natural Science Foundation of China (No. 61571049, 61371185, 61401029, 11401028, 61472044, 61472403, 61601033), the Fundamental Research Funds for the Central Universities (No. 2014KJJCB32, 2013NT57), and by SRF for ROCS, SEM and the China Postdoctoral Science Foundation Funded Project (No. 2016M590337).

Author information

Authors and Affiliations

College of Artificial Intelligence, Beijing Normal University, Beijing, China
Haodi Wang, Rongfang Bie & Hao Wu
Institute of Remote Sensing and Digital Earth (RADI), Chinese Academy of Science (CAS), Beijing, China
Libin Jiao

Corresponding author

Correspondence to Hao Wu.

Editor information

Editors and Affiliations

Peking University, Beijing, China
Yuxin Peng
Nanjing University of Information Science and Technology, Nanjing, China
Qingshan Liu
Dalian University of Technology, Dalian, China
Huchuan Lu
Chinese Academy of Sciences, Beijing, China
Zhenan Sun
Chinese Academy of Sciences, Beijing, China
Chenglin Liu
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Xilin Chen
Peking University, Beijing, China
Hongbin Zha
Nanjing University of Science and Technology, Nanjing, China
Jian Yang

About this paper

Cite this paper

Wang, H., Jiao, L., Bie, R., Wu, H. (2020). Semantic Inpainting with Multi-dimensional Adversarial Network and Wasserstein Distance. In: Peng, Y., et al. Pattern Recognition and Computer Vision. PRCV 2020. Lecture Notes in Computer Science, vol 12307. Springer, Cham. https://doi.org/10.1007/978-3-030-60636-7_7

DOI: https://doi.org/10.1007/978-3-030-60636-7_7
Published: 13 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60635-0
Online ISBN: 978-3-030-60636-7
eBook Packages: Computer Science, Computer Science (R0)
