DOI: 10.1145/3678957.3685712
Research Article · Open Access

Generating Facial Expression Sequences of Complex Emotions with Generative Adversarial Networks

Published: 04 November 2024

Abstract

There is rising interest in animating realistic virtual agents for a variety of purposes across different domains. Such a task requires systems capable of generating complex mental states on par with human emotional complexity. Given the high representational capacity of Generative Adversarial Networks (GANs), they are natural candidates for such applications. In this work, we propose a conditional GAN model for generating sequences of facial expressions of categorical complex emotions. Trained on a scarce and highly imbalanced dataset, the proposed model is able to generate realistic variable-length sequences in a single inference step. The generated expressions cover 24 emotional states in total and are encoded in the Facial Action Coding System (FACS) format. In the absence of meaningful objective evaluation methods, we propose a deep-learning-based metric to assess the realism of generated Action Unit (AU) sequences: the Action Unit Fréchet Inception Distance (AUFID). Objective and subjective results validate the realism of our generated samples.
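
The abstract does not describe the model internals, so the sketch below is only a plausible PyTorch shape for a conditional generator of this kind: a noise vector concatenated with a one-hot label over the 24 emotional states, mapped to a sequence of AU-intensity frames in one forward pass. The AU count (17, matching the intensity outputs of OpenFace 2.0), the latent size, the layer widths, and every name here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

NUM_EMOTIONS = 24  # categorical complex emotional states (from the abstract)
NUM_AUS = 17       # assumption: AU intensities as estimated by OpenFace 2.0
LATENT_DIM = 128   # assumption: size of the noise vector

class ConditionalSequenceGenerator(nn.Module):
    """Hypothetical conditional generator: noise + one-hot emotion label
    mapped to a batch of AU-intensity sequences, non-autoregressively."""

    def __init__(self, max_len: int = 64):
        super().__init__()
        self.max_len = max_len
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM + NUM_EMOTIONS, 512),
            nn.GELU(),
            nn.Linear(512, 512),
            nn.GELU(),
            nn.Linear(512, max_len * NUM_AUS),
        )

    def forward(self, z: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Condition the generator by concatenating the label to the noise.
        onehot = nn.functional.one_hot(labels, NUM_EMOTIONS).float()
        out = self.net(torch.cat([z, onehot], dim=1))
        return out.view(-1, self.max_len, NUM_AUS)

# Single inference step: one AU sequence per sampled emotion label.
z = torch.randn(4, LATENT_DIM)
labels = torch.randint(0, NUM_EMOTIONS, (4,))
sequences = ConditionalSequenceGenerator()(z, labels)  # shape [4, 64, 17]
```

Variable-length output could, for instance, be obtained by emitting an extra per-frame validity channel and truncating at the first invalid frame; the abstract does not say which mechanism the authors use.

The proposed AUFID metric follows the Fréchet Inception Distance template: embed real and generated samples with a pretrained network, fit a Gaussian to each set of embeddings, and compute the Fréchet distance d^2 = ||mu_1 - mu_2||^2 + Tr(Sigma_1 + Sigma_2 - 2 (Sigma_1 Sigma_2)^(1/2)). Below is a minimal sketch of that final distance computation, assuming embeddings of AU sequences have already been extracted (the embedding network itself is the paper's contribution and is not reproduced here).

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between the Gaussians N(mu1, sigma1), N(mu2, sigma2)."""
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)  # matrix square root
    if np.iscomplexobj(covmean):  # drop tiny imaginary parts from numerics
        covmean = covmean.real
    return diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean)

def aufid(real_feats: np.ndarray, fake_feats: np.ndarray) -> float:
    """AUFID-style score from [n_samples, feat_dim] embedding arrays."""
    mu_r, sig_r = real_feats.mean(axis=0), np.cov(real_feats, rowvar=False)
    mu_f, sig_f = fake_feats.mean(axis=0), np.cov(fake_feats, rowvar=False)
    return float(frechet_distance(mu_r, sig_r, mu_f, sig_f))
```

Lower scores indicate that the generated AU sequences are statistically closer to the real ones in embedding space, mirroring how FID is read for images.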

Supplemental Material

WEBM Files (16)
Videos of sequences (real and synthetic) visualized on OpenFACS.
PDF File
Appendix: PCA results for all 24 categorical emotional states.

    Published In

    ICMI '24: Proceedings of the 26th International Conference on Multimodal Interaction
    November 2024
    725 pages
    ISBN: 9798400704628
    DOI: 10.1145/3678957
    This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 04 November 2024

    Author Tags

    1. Complex Emotional States
    2. FACS
    3. Generative Adversarial Networks
    4. Synthetic Action Units

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICMI '24: International Conference on Multimodal Interaction
    November 4 - 8, 2024
    San Jose, Costa Rica

    Acceptance Rates

    Overall Acceptance Rate 453 of 1,080 submissions, 42%
