Abstract
In the present work we aim at optimizing the performance of a speaker-independent emotion recognition system through a speech feature selection process. Specifically, relying on the speech feature set defined in the Interspeech 2009 Emotion Challenge, we studied the relative importance of the individual speech parameters and, based on their ranking, selected a subset of speech parameters that offered advantageous performance. The affect-emotion recognizer utilized here relies on a GMM-UBM-based classifier. In all experiments, we followed the experimental setup defined by the Interspeech 2009 Emotion Challenge, utilizing the FAU Aibo Emotion Corpus of spontaneous, emotionally coloured speech. The experimental results indicate that a careful choice of speech parameters can lead to better performance than the baseline.
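The ranking-then-subset-selection strategy described in the abstract can be illustrated with a minimal sketch. The paper does not specify its ranking criterion here, so this example uses a per-feature Fisher discriminant ratio on synthetic data as a stand-in; the function names and the data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fisher_ratio(X, y):
    """Per-feature Fisher ratio: between-class variance of the class
    means divided by the mean within-class variance. Higher = more
    discriminative. (Illustrative criterion, not the paper's own.)"""
    classes = np.unique(y)
    scores = []
    for j in range(X.shape[1]):
        col = X[:, j]
        means = np.array([col[y == c].mean() for c in classes])
        variances = np.array([col[y == c].var() for c in classes])
        scores.append(means.var() / (variances.mean() + 1e-12))
    return np.array(scores)

def select_top_k(X, y, k):
    """Return the (sorted) indices of the k highest-ranked features."""
    order = np.argsort(fisher_ratio(X, y))[::-1]
    return np.sort(order[:k])

# Synthetic two-class data: 10 features, only 0 and 3 carry class info.
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, n)
X = rng.normal(size=(n, 10))
X[:, 0] += 3.0 * y   # informative feature
X[:, 3] += 2.5 * y   # informative feature

selected = select_top_k(X, y, 2)
print(selected)  # the two informative features should rank highest
```

In the paper's setting, `X` would hold the Interspeech 2009 Challenge feature vectors and the selected subset would then be fed to the GMM-UBM classifier; any such wrapper-style evaluation of candidate subsets is outside this sketch.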
References
Pantic, M., Rothkrantz, L.: Toward an Affect-Sensitive Multi-Modal Human-Computer Interaction. Proc. of the IEEE 91, 1370–1390 (2003)
Cowie, R., Douglas-Cowie, E., Tsapatsoulis, N., Votsis, G., Kollias, S., Fellenz, W., Taylor, J.G.: Emotion Recognition in Human-Computer Interaction. IEEE Signal Processing Magazine 18(1), 32–80 (2001)
Batliner, A., Fisher, K., Huber, R., Spilker, J., Nöth, E.: How to Find Trouble in Communication. Speech Communication 40, 117–143 (2003)
Batliner, A., Burkhardt, F., van Ballegooy, M., Nöth, E.: A Taxonomy of Applications that Utilize Emotional Awareness. In: Erjavec, T., Gros, J. (eds.) Language Technologies, IS-LTC 2006, pp. 246–250 (2006)
Callejas, Z., Lopez-Cozar, R.: Influence of Contextual Information in Emotion Annotation for Spoken Dialogue Systems. Speech Communication, 416–433 (2008)
Iliou, T., Anagnostopoulos, C.N.: Comparison of Different Classifiers for Emotion Recognition. In: 13th Panhellenic Conference on Informatics, pp. 102–106 (2009)
Seppi, D., Batliner, A., Schuller, B., Steidl, S., Vogt, T., Wagner, J., Devillers, L., Vidrascu, L., Amir, N., Aharonson, V.: Patterns, Prototypes, Performance: Classifying Emotional User States. In: Interspeech 2008, pp. 601–604 (2008)
Steidl, S.: Automatic Classification of Emotion-Related User States in Spontaneous Children’s Speech. Logos Verlag, Berlin (2009)
Batliner, A., Steidl, S., Hacker, C., Nöth, E.: Private Emotions vs. Social Interaction – a Data-driven Approach towards Analysing Emotion in Speech. User Modeling and User-Adapted Interaction (UMUAI) 18(1-2), 175–206 (2008)
Ververidis, D., Kotropoulos, C.: Fast and Accurate Feature Subset Selection Applied into Speech Emotion Recognition. Elsevier Signal Processing 88(12), 2956–2970 (2008)
Brendel, M., Zaccarelli, R., Devillers, L.: Building a System for Emotions Detection from Speech to Control an Affective Avatar. In: Proceedings of LREC 2010, pp. 2205–2210 (2010)
Schuller, B., Steidl, S., Batliner, A.: The Interspeech 2009 Emotion Challenge. In: Interspeech 2009, ISCA, Brighton, UK, pp. 312–315 (2009)
Schuller, B., Steidl, S., Batliner, A., Burkhardt, F., Devillers, L., Mueller, C., Narayanan, S.: The Interspeech 2010 Paralinguistic Challenge. In: Interspeech 2010, ISCA, Makuhari, Japan (2010)
Kockmann, M., Burget, L., Cernocky, J.: Brno University of Technology System for Interspeech 2009 Emotion Challenge. In: Interspeech 2009, ISCA, Brighton, UK, pp. 348–351 (2009)
Steidl, S., Schuller, B., Seppi, D., Batliner, A.: The Hinterland of Emotions: Facing the Open-Microphone Challenge. In: Proc. 4th International HUMAINE Association Conference on Affective Computing and Intelligent Interaction 2009 (ACII 2009), vol. 1, pp. 690–697 (2009)
Eyben, F., Wollmer, M., Schuller, B.: openEAR - Introducing the Munich Open-Source Emotion and Affect Recognition Toolkit. In: Proc. of the 4th International HUMAINE Association Conference on Affective Computing and Intelligent Interaction 2009 (ACII 2009). IEEE, Amsterdam (2009)
Reynolds, D.A., Rose, R.C.: Robust Text-Independent Speaker Identification Using Gaussian Mixture Speaker Models. IEEE Transactions on Speech and Audio Processing 3, 72–83 (1995)
Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum Likelihood from Incomplete Data via the EM Algorithm. J. Roy. Stat. Soc. 39, 1–38 (1977)
Reynolds, D.A., Quatieri, T.F., Dunn, R.B.: Speaker Verification Using Adapted Gaussian Mixture Models. Digital Signal Processing 10, 19–41 (2000)
Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques, 2nd edn. Morgan Kaufmann, San Francisco (2005)
Schuller, B., Batliner, A., Seppi, D., Steidl, S., Vogt, T., Wagner, J., Devillers, L., Vidrascu, L., Amir, N., Kessous, L., Aharonson, V.: The Relevance of Feature Type for the Automatic Classification of Emotional User States: Low Level Descriptors and Functionals. In: Interspeech 2007, ISCA, Antwerp, Belgium, pp. 2253–2256 (2007)
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kostoulas, T., Ganchev, T., Lazaridis, A., Fakotakis, N. (2010). Enhancing Emotion Recognition from Speech through Feature Selection. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2010. Lecture Notes in Computer Science(), vol 6231. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15760-8_43
DOI: https://doi.org/10.1007/978-3-642-15760-8_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15759-2
Online ISBN: 978-3-642-15760-8
eBook Packages: Computer Science (R0)