Abstract
Computational face aging predicts a person’s future appearance algorithmically, with two goals: the apparent age of the output should match the target age, and the individual’s identity should be preserved. In this work, we evaluate four generative models on facial aging. Two are based on generative adversarial networks (GANs), HRFAE and SAM; the other two, Pix2pix-zero and InstructPix2pix, are based on diffusion models. The GAN-based models were explicitly trained to generate an aged version of the input face, whereas the diffusion-based models operate zero-shot; that is, they are generic models that perform many editing tasks, facial aging among them. Since diffusion models have been gaining attention for their diversity and high-quality image generation, comparing them against models designed specifically for the task, using meaningful metrics, is essential. We therefore compare these models on the FFHQ-Aging dataset using three metrics: mean absolute error (MAE) of the predicted age, Fréchet inception distance (FID), and the cosine similarity of FaceNet embeddings.
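As a concrete illustration of the identity-preservation metric named above, the cosine similarity between the FaceNet embedding of the original image and that of the aged output can be computed as follows (a minimal sketch using NumPy; the random vectors stand in for FaceNet's 512-dimensional embeddings, which in practice come from running both images through the network):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two face embeddings.

    Values near 1 indicate the aged image preserves the original
    identity; values near 0 indicate unrelated faces.
    """
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Illustrative 512-d vectors in place of real FaceNet embeddings.
rng = np.random.default_rng(0)
emb_original = rng.standard_normal(512)
emb_aged = emb_original + 0.1 * rng.standard_normal(512)  # small perturbation

print(cosine_similarity(emb_original, emb_aged))  # close to 1.0
```

A higher average similarity across the test set indicates that a model alters age-related features without drifting away from the subject's identity.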
V. Ivamoto—Partially supported by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES) – Finance Code 001.
Notes
1. The authors used Stable Diffusion 1.4.
2. Ratio of diffusion steps with cross-attention weights.
References
Alaluf, Y., Patashnik, O., Cohen-Or, D.: Only a matter of style: Age transformation using a style-based regression model. ACM Trans. Graph. 40(4), 1–12 (2021)
Brooks, T., Holynski, A., Efros, A.A.: InstructPix2pix: learning to follow image editing instructions (2022)
Brown, T.B., et al.: Language models are few-shot learners. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. NIPS’20, Curran Associates Inc., Red Hook, NY, USA (2020)
Deng, J., Guo, J., Xue, N., Zafeiriou, S.: ArcFace: additive angular margin loss for deep face recognition. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4685–4694 (2019)
Despois, J., Flament, F., Perrot, M.: AgingMapGAN (AMGAN): high-resolution controllable face aging with spatially-aware conditional GANs. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12537, pp. 613–628. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-67070-2_37
Gal, R., et al.: An image is worth one word: personalizing text-to-image generation using textual inversion (2022)
Goodfellow, I., et al.: Generative adversarial nets. In: Advances in Neural Information Processing Systems, vol. 27 (2014)
Grimmer, M., Ramachandra, R., Busch, C.: Deep face age progression: a survey. IEEE Access 9, 83376–83393 (2021)
Heljakka, A., Solin, A., Kannala, J.: Recursive chaining of reversible image-to-image translators for face aging. In: Blanc-Talon, J., Helbert, D., Philips, W., Popescu, D., Scheunders, P. (eds.) ACIVS 2018. LNCS, vol. 11182, pp. 309–320. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01449-0_26
Hertz, A., Mokady, R., Tenenbaum, J., Aberman, K., Pritch, Y., Cohen-Or, D.: Prompt-to-prompt image editing with cross attention control (2022)
Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Hochreiter, S.: GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17, pp. 6629–6640. Curran Associates Inc., Red Hook, NY, USA (2017)
Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., Lin, H. (eds.) Advances in Neural Information Processing Systems, vol. 33, pp. 6840–6851. Curran Associates, Inc. (2020)
Ho, J., Salimans, T.: Classifier-free diffusion guidance (2022)
Huang, Y., Hu, H.: A parallel architecture of age adversarial convolutional neural network for cross-age face recognition. IEEE Trans. Circuits Syst. Video Technol. 31(1), 148–159 (2021)
Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
Karras, T., Laine, S., Aila, T.: A style-based generator architecture for generative adversarial networks. arXiv:1812.04948 [cs, stat] (2019)
Kemmer, B., Simões, R., Lima, C.: Face aging using generative adversarial networks. In: Razavi-Far, R., Ruiz-Garcia, A., Palade, V., Schmidhuber, J., et al. (eds.) Generative Adversarial Learning: Architectures and Applications. Intelligent Systems Reference Library, vol. 217. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-91390-8_7
Khanna, A., Thakur, A., Tewari, A., Bhat, A.: Cross-age face verification using face aging, pp. 94–99 (2020)
King, D.E.: Dlib-ml: a machine learning toolkit. J. Mach. Learn. Res. 10, 1755–1758 (2009)
Li, J., Li, D., Xiong, C., Hoi, S.: BLIP: bootstrapping language-image pre-training for unified vision-language understanding and generation (2022)
Mao, X., Li, Q., Xie, H., Lau, R.Y.K., Wang, Z., Smolley, S.P.: Least squares generative adversarial networks. In: 2017 IEEE International Conference on Computer Vision (ICCV), pp. 2813–2821 (2017)
Nichol, A., Dhariwal, P.: Improved denoising diffusion probabilistic models (2021)
Or-El, R., Sengupta, S., Fried, O., Shechtman, E., Kemelmacher-Shlizerman, I.: Lifespan age transformation synthesis. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12351, pp. 739–755. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58539-6_44
Parmar, G., Singh, K.K., Zhang, R., Li, Y., Lu, J., Zhu, J.Y.: Zero-shot image-to-image translation (2023)
Radford, A., et al.: Learning transferable visual models from natural language supervision (2021)
Richardson, E., et al.: Encoding in style: a StyleGAN encoder for image-to-image translation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2287–2296 (2021)
Rombach, R., Blattmann, A., Lorenz, D., Esser, P., Ommer, B.: High-resolution image synthesis with latent diffusion models (2021)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Rothe, R., Timofte, R., Van Gool, L.: DEX: deep expectation of apparent age from a single image. In: 2015 IEEE International Conference on Computer Vision Workshop (ICCVW), pp. 252–257 (2015)
Ruiz, N., Li, Y., Jampani, V., Pritch, Y., Rubinstein, M., Aberman, K.: DreamBooth: fine tuning text-to-image diffusion models for subject-driven generation (2022)
Schroff, F., Kalenichenko, D., Philbin, J.: FaceNet: a unified embedding for face recognition and clustering. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 815–823 (2015)
Schuhmann, C., et al.: LAION-400M: open dataset of clip-filtered 400 million image-text pairs (2021)
Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N., Ganguli, S.: Deep unsupervised learning using nonequilibrium thermodynamics. In: Bach, F., Blei, D. (eds.) Proceedings of the 32nd International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 37, pp. 2256–2265. PMLR, Lille, France (2015)
Song, J., Meng, C., Ermon, S.: Denoising diffusion implicit models. arXiv:2010.02502 (2020)
Wang, Z., Tang, X., Luo, W., Gao, S.: Face aging with identity-preserved conditional generative adversarial networks. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7939–7947 (2018)
Wolleb, J., Sandkühler, R., Bieder, F., Cattin, P.C.: The swiss army knife for image-to-image translation: multi-task diffusion models (2022)
Yang, H., Huang, D., Wang, Y., Jain, A.K.: Learning face age progression: a pyramid architecture of GANs. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 31–39 (2018)
Yao, X., Puy, G., Newson, A., Gousseau, Y., Hellier, P.: High resolution face age editing. CoRR abs/2005.04410 (2020)
Zhang, Z., Song, Y., Qi, H.: Age progression/regression by conditional adversarial autoencoder. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4352–4360 (2017)
Zoss, G., Chandran, P., Sifakis, E., Gross, M., Gotardo, P., Bradley, D.: Production-ready face re-aging for visual effects. ACM Trans. Graph. 41(6), 1–15 (2022)
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Kemmer, B., Simões, R., Ivamoto, V., Lima, C.: Performance analysis of generative adversarial networks and diffusion models for face aging. In: Naldi, M.C., Bianchi, R.A.C. (eds.) Intelligent Systems. BRACIS 2023. LNCS, vol. 14196. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-45389-2_16
Print ISBN: 978-3-031-45388-5
Online ISBN: 978-3-031-45389-2