
Emotional facial sensing and multimodal fusion in a continuous 2D affective space

  • Original Research
  • Published in: Journal of Ambient Intelligence and Humanized Computing

Abstract

This paper deals with two main research foci in Affective Computing: facial emotion recognition and the multimodal fusion of affective information coming from different channels. The facial sensing system developed implements an emotional classification mechanism that combines, in a novel and robust manner, the five classifiers most commonly used in the field of affect sensing, outputting for each facial expression a weight associated with each of the six Ekman universal emotional categories plus neutral. The system is able to analyze any subject, male or female, of any age and ethnicity, and has been validated by means of statistical evaluation strategies such as cross-validation, classification accuracy ratios and confusion matrices. The categorical facial sensing system has subsequently been extended to a continuous 2D affective space, which also makes it possible to address the problem of multimodal human affect recognition. A novel fusion methodology, able to fuse any number of affective modules with very different time scales and output labels, is proposed. It relies on the 2D Whissell affective space and outputs a continuous emotional path characterizing the user's affective progress over time. A Kalman filtering technique controls this path in real time to ensure the temporal consistency and robustness of the system. Moreover, the methodology adapts to temporal changes in the reliability of the different inputs. The potential of the multimodal fusion methodology is demonstrated by fusing dynamic affective information extracted from different channels (video, typed-in text and emoticons) of an Instant Messaging tool.
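
The fusion scheme just described lends itself to a compact illustration. The sketch below (Python with NumPy) is not the authors' implementation: it only shows, under stated assumptions, how categorical facial weights can be projected onto a 2D valence/activation plane, fused with another modality by reliability weighting, and smoothed by a constant-velocity Kalman filter. The anchor coordinates, reliability values and function names are all hypothetical; the paper derives its coordinates from Whissell's Dictionary of Affect in Language.

```python
import numpy as np

# Hypothetical (valence, activation) anchors on the Whissell plane for the six
# Ekman categories plus neutral. These values are illustrative placeholders,
# not the coordinates used in the paper.
ANCHORS = {
    "happiness": ( 0.8,  0.5),
    "surprise":  ( 0.4,  0.9),
    "anger":     (-0.6,  0.8),
    "fear":      (-0.7,  0.6),
    "disgust":   (-0.7,  0.2),
    "sadness":   (-0.8, -0.4),
    "neutral":   ( 0.0,  0.0),
}
COORDS = np.array(list(ANCHORS.values()))  # shape (7, 2)

def categorical_to_plane(weights):
    """Project a 7-d categorical weight vector onto the 2D plane as the
    weighted centroid of the category anchor points."""
    w = np.asarray(weights, dtype=float)
    return (w / w.sum()) @ COORDS  # -> (valence, activation)

def fuse_observations(points, reliabilities):
    """Reliability-weighted average of per-modality 2D observations."""
    r = np.asarray(reliabilities, dtype=float)
    return (r / r.sum()) @ np.asarray(points, dtype=float)

class Kalman2D:
    """Constant-velocity Kalman filter over the state [x, y, vx, vy],
    used to keep the emotional path temporally consistent."""
    def __init__(self, q=1e-3, r=1e-2):
        self.x = np.zeros(4)                      # state estimate
        self.P = np.eye(4)                        # state covariance
        self.F = np.eye(4)                        # dynamics (dt = 1)
        self.F[0, 2] = self.F[1, 3] = 1.0
        self.H = np.zeros((2, 4))                 # observe position only
        self.H[0, 0] = self.H[1, 1] = 1.0
        self.Q = q * np.eye(4)                    # process noise
        self.R = r * np.eye(2)                    # measurement noise

    def step(self, z):
        # Predict
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Update with the fused 2D observation z
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (np.asarray(z, dtype=float) - self.H @ self.x)
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]                         # smoothed (valence, activation)

# One fusion instant: a facial reading (categorical weights) and a
# text-sentiment reading already expressed as a 2D point.
face = categorical_to_plane([0.10, 0.05, 0.05, 0.05, 0.05, 0.10, 0.60])
text = (0.3, 0.1)
kf = Kalman2D()
point = kf.step(fuse_observations([face, text], reliabilities=[0.7, 0.3]))
```

The paper's methodology additionally adapts to changes in each input's quality over time; in this sketch the measurement noise R is fixed, but it could be rescaled at each step from the channels' instantaneous confidences.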


References

  • Bassili JN (1979) Emotion recognition: the role of facial movement and the relative importance of upper and lower areas of the face. J Pers Soc Psychol 37:2049–2058

  • Boukricha H, Becker C, Wachsmuth I (2007) Simulating empathy for the virtual human Max. In: Proceedings of the 2nd International Workshop on Emotion and computing, in conjunction with the German Conference on Artificial Intelligence (KI2007), pp 22–27

  • Cambria E, Hussain A, Havasi C, Eckl C (2010) Sentic computing: exploitation of common sense for the development of emotion-sensitive systems. Development of Multimodal Interfaces: Active Listening and Synchrony 5967:153–161

  • Chang CY, Tsai JS, Wang CJ, Chung PC (2009) Emotion recognition with consideration of facial expression and physiological signals. In: Proceedings of the 6th Annual IEEE Conference on Computational intelligence in bioinformatics and computational biology, pp 278–283

  • Douglas-Cowie E, Cowie R, Sneddon I, Cox C, Lowry O, McRorie M, Martin J, Devillers L, Abrilian S, Batliner A et al (2007) The HUMAINE database: addressing the collection and annotation of naturalistic and induced emotional data. In: Proceedings of the 2nd International Conference on Affective computing and intelligent interaction, pp 488–500

  • Du Y, Bi W, Wang T, Zhang Y, Ai H (2007) Distributing expressional faces in 2-D emotional space. In: Proceedings of the 6th ACM International Conference on Image and video retrieval, pp 395–400

  • Ekman P (1999) In: Dalgleish T, Power M (eds) Handbook of cognition and emotion. Wiley, Chichester

  • Ekman P, Friesen WV, Hager JC (2002) Facial action coding system. Research Nexus eBook, Salt Lake City

  • Fragopanagos N, Taylor JG (2005) Emotion recognition in human–computer interaction. Neural Netw 18:389–405

  • Caridakis G, Malatesta L, Kessous L, Amir N, Raouzaiou A, Karpouzis K (2006) Modeling naturalistic affective states via facial and vocal expression recognition. In: Proceedings of the International Conference on multimodal interfaces, pp 146–154

  • Gilroy S, Cavazza M, Niiranen M, André E, Vogt T, Urbain J, Benayoun M, Seichter H, Billinghurst M (2009) PAD-based multimodal affective fusion. In: Proceedings of the Conference on Affective computing and intelligent interaction, pp 1–8

  • Gosselin F, Schyns PG (2001) Bubbles: a technique to reveal the use of information in recognition tasks. Vis Res 41:2261–2271

  • Grimm M, Kroschel K, Narayanan S (2008) The Vera am Mittag German audio-visual emotional speech database. In: Proceedings of the IEEE International Conference on multimedia and expo, pp 865–868

  • Gunes H, Piccardi M (2007) Bi-modal emotion recognition from expressive face and body gestures. J Netw Comp Appl 30(4):1334–1345

  • Gunes H, Piccardi M, Pantic M (2008) From the lab to the real world: affect recognition using multiple cues and modalities. In: Or J (ed) Affective computing. InTech, Vienna, pp 185–218

  • Hall MA (1998) Correlation-based feature selection for machine learning. PhD thesis, University of Waikato, Hamilton, New Zealand

  • Hammal Z, Caplier A, Rombaut M (2005) Belief theory applied to facial expressions classification. In: Pattern recognition and image analysis. Lect Notes Comput Sci 3687:183–191

  • Hupont I, Baldassarri S, del Hoyo R, Cerezo E (2008) Effective emotional classification combining facial classifiers and user assessment. Lect Notes Comput Sci 5098:431–440

  • Ji Q, Lan P, Looney C (2006) A probabilistic framework for modeling and real-time monitoring human fatigue. IEEE Trans Syst Man Cybern Part A 36:862–875

  • Kapoor A, Burleson W, Picard RW (2007) Automatic prediction of frustration. Int J Human–Comp Studies 65:724–736

  • Kayan S, Fussell S, Setlock L (2006) Cultural differences in the use of Instant Messaging in Asia and North America. In: Proceedings of the 2006 Conference on Computer supported cooperative work, pp 525–528

  • Keltner D, Ekman P (2000) Facial expression of emotion. Handbook of emotions 2:236–249

  • Kumar P, Yildirim EA (2005) Minimum-volume enclosing ellipsoids and core sets. J Optimization Theory Appl 126:1–21

  • Kuncheva L (2004) Combining pattern classifiers: methods and algorithms. Wiley-Interscience, Hoboken

  • Littlewort G, Bartlett MS, Fasel I, Susskind J, Movellan J (2006) Dynamics of facial expression extracted automatically from video. Image Vis Comput 24:615–625

  • Morrell D, Stirling W (2003) An extended set-valued Kalman filter. In: Proceedings of the 3rd International Symposium on Imprecise probabilities and their applications (ISIPTA'03), pp 396–407

  • Pal P, Iyer A, Yantorno R (2006) Emotion detection from infant facial expressions and cries. In: Proceedings of the IEEE International Conference on Acoustics, speech and signal processing 2, pp 721–724

  • Pantic M, Valstar M, Rademaker R, Maat L (2005) Web-based database for facial expression analysis. In: IEEE International Conference on Multimedia and Expo, pp 317–321

  • Petridis S, Gunes H, Kaltwang S, Pantic M (2009) Static vs. dynamic modeling of human nonverbal behavior from multiple cues and modalities. In: Proceedings of the International Conference on multimodal interfaces, pp 23–30

  • Picard R (1997) Affective computing. MIT Press, Cambridge

  • Plutchik R (1980) Emotion: a psychoevolutionary synthesis. Harper & Row, New York

  • Pun T, Alecu T, Chanel G, Kronegg J, Voloshynovskiy S (2006) Brain-computer interaction research at the Computer Vision and Multimedia Laboratory, University of Geneva. IEEE Trans Neural Syst Rehabil Eng 14(2):210–213

  • Sánchez J, Hernández N, Penagos J, Ostróvskaya Y (2006) Conveying mood and emotion in Instant Messaging by using a two-dimensional model for affective states. In: Proceedings of VII Brazilian Symposium on Human factors in computing systems, pp 66–72

  • Shan C, Gong S, McOwan P (2007) Beyond facial expressions: learning human emotion from body gestures. In: Proceedings of the British Machine Vision Conference

  • Soyel H, Demirel H (2007) Facial expression recognition using 3D facial feature distances. Lect Notes Comput Sci 4633:831–838

  • Stoiber N, Seguier R, Breton G (2009) Automatic design of a control interface for a synthetic face. In: Proceedings of the 13th International Conference on Intelligent user interfaces, pp 207–216

  • Wallhoff F (2006) Facial expressions and emotion database. Technische Universität München. Available: http://www.mmk.ei.tum.de/~waf/fgnet/feedtum.html

  • Whissell CM (1989) The dictionary of affect in language. In: Plutchik R, Kellerman H (eds) Emotion: theory, research and experience, vol 4: The measurement of emotions. Academic Press, New York

  • Witten I, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco

  • Wolf A (2000) Emotional expression online: gender differences in emoticon use. CyberPsychol Behav 3(5):827–833

  • Wöllmer M, Al-Hames M, Eyben F, Schuller B, Rigoll G (2009) A multidimensional Dynamic Time Warping algorithm for efficient multimodal fusion of asynchronous data streams. Neurocomputing 73(1–3):366–380

  • Yeasin M, Bullot B, Sharma R (2006) Recognition of facial expressions and measurement of levels of interest from video. IEEE Trans Multimed 8:500–508

  • Zeng Z, Tu J, Liu M, Huang T, Pianfetti B, Roth D, Levinson S (2007) Audio-visual affect recognition. IEEE Trans Multimed 9(2):424–428

  • Zeng Z, Pantic M, Huang TS (2009a) Emotion recognition based on multimodal information. In: Tao J, Tan T (eds) Affective information processing. Springer, London, pp 241–265

  • Zeng Z, Pantic M, Roisman GI, Huang TS (2009b) A survey of affect recognition methods: audio, visual, and spontaneous expressions. IEEE Trans Pattern Anal Mach Intell 31(1):39–58

Acknowledgments

The authors wish to thank Dr. Cynthia Whissell for her explanations and kindness, Dr. Hussain and E. Cambria for the text analysis tool, and all the participants in the evaluation sessions. This work has been partly funded by the University of Zaragoza through the AVIM project.

Author information

Corresponding author

Correspondence to Eva Cerezo.

About this article

Cite this article

Cerezo, E., Hupont, I., Baldassarri, S. et al. Emotional facial sensing and multimodal fusion in a continuous 2D affective space. J Ambient Intell Human Comput 3, 31–46 (2012). https://doi.org/10.1007/s12652-011-0087-6
