A novel attribute-based generation architecture for facial image editing

Li, Defang; Zhang, Min; Zhang, Lifang; Chen, Weifu; Feng, Guocan

doi:10.1007/s11042-020-09858-7

Published: 02 October 2020

Volume 80, pages 4881–4902, (2021)

Multimedia Tools and Applications

Defang Li^1,2,
Min Zhang^1,2,
Lifang Zhang^3,
Weifu Chen^1,2 (ORCID: orcid.org/0000-0002-9375-2214) &
Guocan Feng^1,2

583 Accesses
3 Citations

Abstract

Facial image editing has become a hot topic in recent years owing to rapid progress in deep generative models. Current models are based on either the variational autoencoder (VAE) or the generative adversarial network (GAN). However, VAE-based models usually generate over-smoothed images, while GAN-only models cannot randomly generate images with specific attributes and suffer from unstable training. To overcome these limitations, a novel attribute-disentangled generative model that combines VAE and GAN is proposed for facial image editing; it manipulates specific attributes and synthesizes facial images conditioned on the specified attributes. In the encoder-decoder architecture of the proposed model, the latent space mapped by the encoder is split into two subspaces: the attribute-irrelevant space and the attribute-relevant space. The attribute-irrelevant space characterizes factors such as identity, position, and background, which are expected to remain unchanged during editing. The attribute-relevant space represents the attributes, such as hair color, gender, and age, that we want to manipulate. We use an adversarial training scheme to train the model: images generated by the proposed model are fed back to the encoder to ensure that their distribution is close to the real data distribution in the attribute-irrelevant subspace while they can be correctly classified in the attribute-relevant subspace, without explicitly providing discriminators as in GANs. To evaluate the performance of the proposed model, quantitative and qualitative comparisons between the proposed model and other state-of-the-art algorithms were conducted on the CelebA dataset. The evaluation results show that the proposed model can effectively generate high-quality facial images with diverse specified attributes.
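
The latent-space split described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names `split_latent` and `edit_attributes`, the convention that the last `n_attr` dimensions hold the attribute-relevant code, and the example values are all assumptions made for illustration. The sketch shows only the editing idea, namely that the attribute-irrelevant part is kept while the attribute-relevant part is replaced; the encoder, decoder, and adversarial training are omitted.

```python
from typing import List, Sequence, Tuple


def split_latent(z: Sequence[float], n_attr: int) -> Tuple[List[float], List[float]]:
    """Split a latent code into (attribute-irrelevant, attribute-relevant) parts.

    Assumption for this sketch: the last n_attr dimensions are the
    attribute-relevant subspace; the abstract does not specify the layout.
    """
    if not 0 < n_attr < len(z):
        raise ValueError("n_attr must be between 1 and len(z) - 1")
    cut = len(z) - n_attr
    return list(z[:cut]), list(z[cut:])


def edit_attributes(z: Sequence[float], n_attr: int,
                    target_attr: Sequence[float]) -> List[float]:
    """Keep the attribute-irrelevant part of z and swap in target attributes."""
    if len(target_attr) != n_attr:
        raise ValueError("target_attr must have exactly n_attr entries")
    z_irrelevant, _ = split_latent(z, n_attr)
    return z_irrelevant + list(target_attr)


# Hypothetical 6-dimensional code: 4 attribute-irrelevant dimensions
# (identity, position, background, ...) and 2 attribute dimensions.
z = [0.3, -1.2, 0.8, 0.1, 1.0, 0.0]
edited = edit_attributes(z, n_attr=2, target_attr=[0.0, 1.0])
```

In a full model, `edited` would be passed to the decoder to synthesize the edited face; here it only demonstrates that the first four coordinates are untouched by the edit.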

Acknowledgments

This research was funded by the Natural Science Foundation of China under grant numbers 61673018, 61272338, and 61703443, by the Guangzhou Science and Technology Founding Committee under grant No. 201707010222, and by the Guangdong Province Key Laboratory of Computer Science.

Author information

Authors and Affiliations

School of Mathematics, Sun Yat-sen University, Guangzhou, China
Defang Li, Min Zhang, Weifu Chen & Guocan Feng
Guangdong Province Key Laboratory, Sun Yat-sen University, Guangzhou, China
Defang Li, Min Zhang, Weifu Chen & Guocan Feng
School of Computer Science and Technology, Dongguan University of Technology, Dongguan, China
Lifang Zhang

Corresponding author

Correspondence to Weifu Chen.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, D., Zhang, M., Zhang, L. et al. A novel attribute-based generation architecture for facial image editing. Multimed Tools Appl 80, 4881–4902 (2021). https://doi.org/10.1007/s11042-020-09858-7

Received: 21 September 2019
Revised: 27 August 2020
Accepted: 09 September 2020
Published: 02 October 2020
Issue Date: February 2021
DOI: https://doi.org/10.1007/s11042-020-09858-7
