FAE-GAN: facial attribute editing with multi-scale attention normalization

Machine Vision and Applications

Abstract

Facial attribute editing has gained increasing attention in recent years. Previous methods tackle this task by combining an encoder–decoder architecture with generative adversarial networks. However, the bottleneck layer in the encoder–decoder often leads to blurry, low-quality editing results, and while skip connections between deep and shallow layers improve image quality, they limit the ability to manipulate attributes. To address these issues, we propose a novel Facial Attribute Editing Generative Adversarial Network (FAE-GAN) designed from a selective refinement perspective: it focuses on editing the attributes that should change while preserving the image's unique details. Specifically, our method first learns a spatially varying function that maps a high-level feature map to an appropriate parameter map for the normalization layer. Then, through a residual block, the low-level feature map is added to the modulated feature map, which makes the attribute refinement task easier. Experimental results demonstrate the superiority of our method in both attribute manipulation accuracy and perceptual quality.
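The abstract describes the mechanism only at a high level. As a rough illustration, the following minimal PyTorch sketch shows the two ideas it names: a spatially varying function that maps a high-level feature map to per-pixel parameters of a normalization layer (in the spirit of SPADE-style spatially adaptive normalization), and a residual block that adds the low-level feature map back onto the modulated features, so only the attribute-specific refinement has to be learned. All class names, layer sizes, and shapes below (SpatiallyAdaptiveNorm, RefinementResBlock, the hidden width of 128) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatiallyAdaptiveNorm(nn.Module):
    """Normalization whose scale/shift parameters vary per spatial location.

    A high-level feature map is mapped by a small conv network to
    per-pixel gamma and beta maps that modulate the normalized input
    (hypothetical SPADE-style sketch, not the paper's exact layer).
    """
    def __init__(self, num_features, guide_channels, hidden=128):
        super().__init__()
        self.norm = nn.InstanceNorm2d(num_features, affine=False)
        self.shared = nn.Sequential(
            nn.Conv2d(guide_channels, hidden, 3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.gamma = nn.Conv2d(hidden, num_features, 3, padding=1)
        self.beta = nn.Conv2d(hidden, num_features, 3, padding=1)

    def forward(self, x, guide):
        # Resize the guiding high-level feature map to x's resolution,
        # then predict spatially varying modulation parameters from it.
        guide = F.interpolate(guide, size=x.shape[2:], mode='nearest')
        h = self.shared(guide)
        return self.norm(x) * (1 + self.gamma(h)) + self.beta(h)

class RefinementResBlock(nn.Module):
    """Residual block that adds the low-level (shallow) feature map back
    onto the modulated features, so the network learns only the
    attribute-specific refinement (illustrative assumption)."""
    def __init__(self, channels, guide_channels):
        super().__init__()
        self.adanorm = SpatiallyAdaptiveNorm(channels, guide_channels)
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, low_level, guide):
        modulated = self.adanorm(low_level, guide)
        return low_level + self.conv(F.relu(modulated))
```

With toy shapes, a low-level map at 64 x 64 modulated by a high-level guide at 16 x 16 yields a refined map at the low-level resolution:

```python
block = RefinementResBlock(channels=64, guide_channels=256)
low = torch.randn(1, 64, 64, 64)     # shallow, detail-rich features
guide = torch.randn(1, 256, 16, 16)  # deep, attribute-level features
out = block(low, guide)              # -> torch.Size([1, 64, 64, 64])
```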


Acknowledgements

This work was supported by the project "Development of a complex-domain neural network system for 3D face recognition based on 3D real-time imaging" (No. JD2019XKJH0029). The authors would like to thank the anonymous reviewers for their helpful and constructive comments and suggestions regarding this manuscript.

Author information

Corresponding author

Correspondence to Shu Zhan.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhu, J., Ouyang, P., Tao, R. et al. FAE-GAN: facial attribute editing with multi-scale attention normalization. Machine Vision and Applications 32, 97 (2021). https://doi.org/10.1007/s00138-021-01208-3

