DOI: 10.1145/2388676.2388783

Robust continuous prediction of human emotions using multiscale dynamic cues

Published: 22 October 2012

Abstract

Designing systems able to interact with humans in a natural manner is a complex and still largely unsolved problem. A key aspect of natural interaction is the ability to understand and respond appropriately to human emotions. This paper details our response to the Audio/Visual Emotion Challenge (AVEC'12), whose goal is to continuously predict four affective signals describing human emotions (namely valence, arousal, expectancy and power). The proposed method uses log-magnitude Fourier spectra to extract multiscale dynamic descriptions of signals characterizing global and local face appearance, head movements and voice. We perform kernel regression with very few representative samples, selected via supervised weighted-distance-based clustering, which yields high generalization power. For feature selection, we introduce a new correlation-based measure that takes into account a possible delay between the labels and the data, significantly increasing robustness. We also propose a particularly fast regressor-level fusion framework to merge systems based on different modalities. Experiments confirm the effectiveness of each key component of the proposed method, and we obtain very promising results.
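
As a concrete illustration of the pipeline sketched above, here is a minimal Python sketch of four of its ingredients: log-magnitude Fourier descriptors over sliding windows, a delay-tolerant correlation measure for feature selection, kernel regression over a small set of representative samples (written here as the classical Nadaraya-Watson estimator, one standard form of kernel regression), and simple regressor-level fusion. All function names, window lengths, bandwidths and selection heuristics are our own illustrative assumptions, not the paper's exact implementation.

```python
# Illustrative sketch only: parameter values, helper names and selection
# heuristics are assumptions, not taken from the paper.
import numpy as np

def log_magnitude_spectrum(signal, win_len=64, hop=16):
    """Dynamic descriptor of one cue (e.g. a head-pose angle over time):
    log-magnitude Fourier spectrum of each sliding window."""
    feats = []
    for start in range(0, len(signal) - win_len + 1, hop):
        win = signal[start:start + win_len] * np.hanning(win_len)
        feats.append(np.log1p(np.abs(np.fft.rfft(win))))
    return np.asarray(feats)                     # (n_windows, win_len//2 + 1)

def delayed_correlation(feature, labels, max_delay=50):
    """Relevance score that tolerates a lag between data and labels:
    best absolute Pearson correlation over candidate delays."""
    best = 0.0
    for d in range(max_delay + 1):
        x = feature[:len(feature) - d] if d else feature
        r = np.corrcoef(x, labels[d:])[0, 1]
        if np.isfinite(r):
            best = max(best, abs(r))
    return best

def nadaraya_watson(X_repr, y_repr, X_query, bandwidth=1.0):
    """Nadaraya-Watson kernel regression using only a few representative
    samples (assumed pre-selected, e.g. as cluster centers)."""
    d2 = ((X_query[:, None, :] - X_repr[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))     # Gaussian kernel weights
    return (w @ y_repr) / np.clip(w.sum(axis=1), 1e-12, None)

def fuse(predictions, weights):
    """Regressor-level fusion: weighted average of per-modality predictions."""
    w = np.asarray(weights, dtype=float)
    return (w[:, None] * np.stack(predictions)).sum(axis=0) / w.sum()

# Toy end-to-end usage on one synthetic cue and a fake valence trace.
rng = np.random.default_rng(0)
cue = np.cumsum(rng.standard_normal(2000)) * 0.01
F = log_magnitude_spectrum(cue)                           # per-window features
labels = np.tanh(np.linspace(-2.0, 2.0, F.shape[0]))      # fake valence trace
scores = [delayed_correlation(F[:, k], labels) for k in range(F.shape[1])]
keep = np.argsort(scores)[-8:]                            # 8 most relevant bins
reps = np.linspace(0, F.shape[0] - 1, 20).astype(int)     # crude representatives
pred = nadaraya_watson(F[reps][:, keep], labels[reps], F[:, keep], bandwidth=2.0)
fused = fuse([pred, pred], [0.7, 0.3])                    # stand-in for 2 modalities
```

The appeal of the representative-sample scheme described in the abstract is that prediction cost scales with the handful of retained samples rather than the full training set; here the representatives are picked naively, whereas the paper selects them with supervised weighted-distance-based clustering.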


Published In

ICMI '12: Proceedings of the 14th ACM international conference on Multimodal interaction
October 2012
636 pages
ISBN:9781450314671
DOI:10.1145/2388676


Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. affective computing
  2. dynamic features
  3. facial expressions
  4. feature selection
  5. multimodal fusion

Qualifiers

  • Research-article

Conference

ICMI '12: International Conference on Multimodal Interaction
October 22-26, 2012
Santa Monica, California, USA

Acceptance Rates

Overall acceptance rate: 453 of 1,080 submissions (42%)

Article Metrics

  • Downloads (last 12 months): 13
  • Downloads (last 6 weeks): 0
Reflects downloads up to 05 Mar 2025

Cited By

  • (2024) Large Language Models for Emotion Evolution Prediction. Computational Science and Its Applications – ICCSA 2024 Workshops, pages 3-19. DOI: 10.1007/978-3-031-65154-0_1. Online publication date: 30-Jul-2024
  • (2023) Audio-Visual Emotion Recognition With Preference Learning Based on Intended and Multi-Modal Perceived Labels. IEEE Transactions on Affective Computing, 14(4):2954-2969. DOI: 10.1109/TAFFC.2023.3234777. Online publication date: 1-Oct-2023
  • (2023) Ensemble Learning to Assess Dynamics of Affective Experience Ratings and Physiological Change. 2023 11th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW), pages 1-8. DOI: 10.1109/ACIIW59127.2023.10388116. Online publication date: 10-Sep-2023
  • (2023) Prediction of Face Emotion with Labelled Selective Transfer Machine as a Generalized Emotion Classifier. Advanced Computing, pages 294-307. DOI: 10.1007/978-3-031-35644-5_23. Online publication date: 14-Jul-2023
  • (2022) A Multi-Scale Multi-Task Learning Model for Continuous Dimensional Emotion Recognition from Audio. Electronics, 11(3):417. DOI: 10.3390/electronics11030417. Online publication date: 29-Jan-2022
  • (2022) Leveraging the Deep Learning Paradigm for Continuous Affect Estimation from Facial Expressions. IEEE Transactions on Affective Computing, 13(1):426-439. DOI: 10.1109/TAFFC.2019.2944603. Online publication date: 1-Jan-2022
  • (2022) Facial expression recognition based on anomaly feature. Optical Review, 29(3):178-187. DOI: 10.1007/s10043-022-00734-3. Online publication date: 18-Apr-2022
  • (2022) Facial Expression Modeling. Face Analysis Under Uncontrolled Conditions, pages 191-222. DOI: 10.1002/9781394173853.ch5. Online publication date: 16-Sep-2022
  • (2021) Dynamics of facial actions for assessing smile genuineness. PLOS ONE, 16(1):e0244647. DOI: 10.1371/journal.pone.0244647. Online publication date: 5-Jan-2021
  • (2021) Monocular 3D Facial Expression Features for Continuous Affect Recognition. IEEE Transactions on Multimedia, 23:3540-3550. DOI: 10.1109/TMM.2020.3026894. Online publication date: 1-Jan-2021
