Abstract
In this paper, we present and discuss two empirical studies, involving human subjects and human observers, concerning the recognition of emotions from the audio-lingual and visual-facial modalities. Many researchers agree that these modalities complement each other and that combining the two can improve the accuracy of affective user models. However, there is a shortage of empirical work on the strengths and weaknesses of each modality that could inform the construction of more accurate recognizers. In our research, we have investigated the recognition of emotions from the above-mentioned modalities with respect to six basic emotional states, namely happiness, sadness, surprise, anger, and disgust, as well as the emotionless state, which we refer to as neutral. We have found that certain states, such as neutral, happiness, and surprise, are more clearly recognized from the visual-facial modality, whereas sadness and disgust are more clearly recognized from the audio-lingual modality.
Support for this work was provided by the General Secretariat of Research and Technology, Greece, under the auspices of the PENED-2003 program.
© 2007 Springer-Verlag Berlin Heidelberg
Virvou, M., Tsihrintzis, G.A., Alepis, E., Stathopoulou, I.O., Kabassi, K. (2007). Combining Empirical Studies of Audio-Lingual and Visual-Facial Modalities for Emotion Recognition. In: Apolloni, B., Howlett, R.J., Jain, L. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2007. Lecture Notes in Computer Science(), vol 4693. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74827-4_141
Print ISBN: 978-3-540-74826-7
Online ISBN: 978-3-540-74827-4
eBook Packages: Computer Science (R0)