
Attentive Semantic and Perceptual Faces Completion Using Self-attention Generative Adversarial Networks

Published in Neural Processing Letters.

Abstract

We propose an approach based on self-attention generative adversarial networks (GANs) for image completion, producing completed images that are both globally and locally consistent. Using self-attention GANs with contextual and other constraints, the generator draws realistic images in which fine details are generated inside the damaged region and are semantically coordinated with the image as a whole. To train this generator, i.e. the image completion network, we employ global and local discriminators: the global discriminator evaluates the consistency of the entire image, while the local discriminator assesses local consistency by analyzing only the local areas that contain the completed regions. Finally, an attentive recurrent neural block is introduced to obtain an attention map of the missing part of the image, which helps the subsequent completion network fill in content more effectively. Comparisons with other approaches on the CelebA dataset show that our method achieves relatively good results.
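The self-attention mechanism that the generator relies on can be illustrated with a minimal sketch following the general SAGAN formulation: feature positions attend to all other positions, and a learned scalar blends the attention output back into the input. The function and weight names below (`self_attention`, `wf`, `wg`, `wh`, `gamma`) are hypothetical illustrations, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(features, wf, wg, wh, gamma=0.0):
    """SAGAN-style self-attention over a flattened feature map.

    features: (N, C) array of N spatial positions with C channels.
    wf, wg, wh: query/key/value projection matrices (stand-ins for
        learned 1x1 convolutions).
    gamma: learned scalar that blends the attention output with the
        input (initialized to 0 in SAGAN so training starts local).
    """
    f = features @ wf                 # queries, (N, C')
    g = features @ wg                 # keys,    (N, C')
    h = features @ wh                 # values,  (N, C)
    attn = softmax(f @ g.T, axis=-1)  # (N, N): each row sums to 1
    out = attn @ h                    # every position attends to all others
    return gamma * out + features     # residual connection

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8))      # e.g. a 4x4 feature map, flattened
wf = rng.standard_normal((8, 4))
wg = rng.standard_normal((8, 4))
wh = rng.standard_normal((8, 8))
y = self_attention(x, wf, wg, wh, gamma=0.1)
print(y.shape)  # (16, 8)
```

Because attention is computed over all positions, a completed pixel can draw on context from the entire face rather than only its convolutional neighborhood, which is what lets fine details in the hole stay semantically coordinated with the whole image.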



Acknowledgements

This work was supported by the National Key R&D Program of China under Grant 2018YFB1003401, and in part by the National Outstanding Youth Science Program of National Natural Science Foundation of China under Grant 61625202.

Author information


Corresponding author

Correspondence to Kenli Li.


Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Liu, X., Li, K. & Li, K. Attentive Semantic and Perceptual Faces Completion Using Self-attention Generative Adversarial Networks. Neural Process Lett 51, 211–229 (2020). https://doi.org/10.1007/s11063-019-10080-2

