DOI: 10.1145/2388676.2388783

Robust continuous prediction of human emotions using multiscale dynamic cues

Published: 22 October 2012

Abstract

Designing systems able to interact with humans in a natural manner is a complex and still largely unsolved problem. A key aspect of natural interaction is the ability to understand and respond appropriately to human emotions. This paper details our response to the Audio/Visual Emotion Challenge (AVEC'12), whose goal is to continuously predict four affective signals describing human emotions (namely valence, arousal, expectancy and power). The proposed method uses log-magnitude Fourier spectra to extract multiscale dynamic descriptions of signals characterizing global and local face appearance, head movements and voice. We perform kernel regression with very few representative samples, selected via supervised weighted-distance-based clustering, which yields high generalization power. For feature selection, we introduce a new correlation-based measure that takes into account a possible delay between the labels and the data, significantly increasing robustness. We also propose a particularly fast regressor-level fusion framework to merge systems based on different modalities. Experiments confirm the effectiveness of each key component of the proposed method, and we obtain very promising results.
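
As a concrete illustration of the pipeline sketched above, here is a minimal Python sketch of four of its ingredients: log-magnitude Fourier descriptors over sliding windows, a delay-tolerant correlation measure for feature selection, kernel regression over a small set of representative samples (written here as the classical Nadaraya-Watson estimator, one standard form of kernel regression), and simple regressor-level fusion. All function names, window lengths, bandwidths and selection heuristics are our own illustrative assumptions, not the paper's exact implementation.

```python
# Illustrative sketch only: parameter values, helper names and selection
# heuristics are assumptions, not taken from the paper.
import numpy as np

def log_magnitude_spectrum(signal, win_len=64, hop=16):
    """Dynamic descriptor of one cue (e.g. a head-pose angle over time):
    log-magnitude Fourier spectrum of each sliding window."""
    feats = []
    for start in range(0, len(signal) - win_len + 1, hop):
        win = signal[start:start + win_len] * np.hanning(win_len)
        feats.append(np.log1p(np.abs(np.fft.rfft(win))))
    return np.asarray(feats)                     # (n_windows, win_len//2 + 1)

def delayed_correlation(feature, labels, max_delay=50):
    """Relevance score that tolerates a lag between data and labels:
    best absolute Pearson correlation over candidate delays."""
    best = 0.0
    for d in range(max_delay + 1):
        x = feature[:len(feature) - d] if d else feature
        r = np.corrcoef(x, labels[d:])[0, 1]
        if np.isfinite(r):
            best = max(best, abs(r))
    return best

def nadaraya_watson(X_repr, y_repr, X_query, bandwidth=1.0):
    """Nadaraya-Watson kernel regression using only a few representative
    samples (assumed pre-selected, e.g. as cluster centers)."""
    d2 = ((X_query[:, None, :] - X_repr[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))     # Gaussian kernel weights
    return (w @ y_repr) / np.clip(w.sum(axis=1), 1e-12, None)

def fuse(predictions, weights):
    """Regressor-level fusion: weighted average of per-modality predictions."""
    w = np.asarray(weights, dtype=float)
    return (w[:, None] * np.stack(predictions)).sum(axis=0) / w.sum()

# Toy end-to-end usage on one synthetic cue and a fake valence trace.
rng = np.random.default_rng(0)
cue = np.cumsum(rng.standard_normal(2000)) * 0.01
F = log_magnitude_spectrum(cue)                           # per-window features
labels = np.tanh(np.linspace(-2.0, 2.0, F.shape[0]))      # fake valence trace
scores = [delayed_correlation(F[:, k], labels) for k in range(F.shape[1])]
keep = np.argsort(scores)[-8:]                            # 8 most relevant bins
reps = np.linspace(0, F.shape[0] - 1, 20).astype(int)     # crude representatives
pred = nadaraya_watson(F[reps][:, keep], labels[reps], F[:, keep], bandwidth=2.0)
fused = fuse([pred, pred], [0.7, 0.3])                    # stand-in for 2 modalities
```

The appeal of the representative-sample scheme described in the abstract is that prediction cost scales with the handful of retained samples rather than the full training set; here the representatives are picked naively, whereas the paper selects them with supervised weighted-distance-based clustering.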


Published In

ICMI '12: Proceedings of the 14th ACM international conference on Multimodal interaction
October 2012
636 pages
ISBN:9781450314671
DOI:10.1145/2388676


Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. affective computing
  2. dynamic features
  3. facial expressions
  4. feature selection
  5. multimodal fusion

Qualifiers

  • Research-article

Conference

ICMI '12: International Conference on Multimodal Interaction
October 22-26, 2012
Santa Monica, California, USA

Acceptance Rates

Overall acceptance rate: 453 of 1,080 submissions (42%)

Article Metrics

  • Downloads (last 12 months): 13
  • Downloads (last 6 weeks): 0
Reflects downloads up to 05 Mar 2025

Cited By

  • (2024) Large Language Models for Emotion Evolution Prediction. Computational Science and Its Applications – ICCSA 2024 Workshops, pages 3-19. DOI: 10.1007/978-3-031-65154-0_1. Online publication date: 30-Jul-2024
  • (2023) Audio-Visual Emotion Recognition With Preference Learning Based on Intended and Multi-Modal Perceived Labels. IEEE Transactions on Affective Computing, 14(4):2954-2969. DOI: 10.1109/TAFFC.2023.3234777. Online publication date: 1-Oct-2023
  • (2023) Ensemble Learning to Assess Dynamics of Affective Experience Ratings and Physiological Change. 2023 11th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW), pages 1-8. DOI: 10.1109/ACIIW59127.2023.10388116. Online publication date: 10-Sep-2023
  • (2023) Prediction of Face Emotion with Labelled Selective Transfer Machine as a Generalized Emotion Classifier. Advanced Computing, pages 294-307. DOI: 10.1007/978-3-031-35644-5_23. Online publication date: 14-Jul-2023
  • (2022) A Multi-Scale Multi-Task Learning Model for Continuous Dimensional Emotion Recognition from Audio. Electronics, 11(3):417. DOI: 10.3390/electronics11030417. Online publication date: 29-Jan-2022
  • (2022) Leveraging the Deep Learning Paradigm for Continuous Affect Estimation from Facial Expressions. IEEE Transactions on Affective Computing, 13(1):426-439. DOI: 10.1109/TAFFC.2019.2944603. Online publication date: 1-Jan-2022
  • (2022) Facial expression recognition based on anomaly feature. Optical Review, 29(3):178-187. DOI: 10.1007/s10043-022-00734-3. Online publication date: 18-Apr-2022
  • (2022) Facial Expression Modeling. Face Analysis Under Uncontrolled Conditions, pages 191-222. DOI: 10.1002/9781394173853.ch5. Online publication date: 16-Sep-2022
  • (2021) Dynamics of facial actions for assessing smile genuineness. PLOS ONE, 16(1):e0244647. DOI: 10.1371/journal.pone.0244647. Online publication date: 5-Jan-2021
  • (2021) Monocular 3D Facial Expression Features for Continuous Affect Recognition. IEEE Transactions on Multimedia, 23:3540-3550. DOI: 10.1109/TMM.2020.3026894. Online publication date: 1-Jan-2021
