Abstract
In the present work we aim at optimizing the performance of a speaker-independent emotion recognition system through a speech feature selection process. Specifically, relying on the speech feature set defined in the Interspeech 2009 Emotion Challenge, we studied the relative importance of the individual speech parameters and, based on their ranking, selected a subset of speech parameters that offered advantageous performance. The affect-emotion recognizer utilized here relies on a GMM-UBM-based classifier. In all experiments, we followed the experimental setup defined by the Interspeech 2009 Emotion Challenge, utilizing the FAU Aibo Emotion Corpus of spontaneous, emotionally coloured speech. The experimental results indicate that a careful choice of speech parameters can lead to better performance than the baseline.
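The ranking-then-subset-selection strategy described in the abstract can be illustrated with a minimal sketch. The paper does not specify its ranking criterion here, so this example uses a per-feature Fisher discriminant ratio on synthetic data as a stand-in; the function names and the data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fisher_ratio(X, y):
    """Per-feature Fisher ratio: between-class variance of the class
    means divided by the mean within-class variance. Higher = more
    discriminative. (Illustrative criterion, not the paper's own.)"""
    classes = np.unique(y)
    scores = []
    for j in range(X.shape[1]):
        col = X[:, j]
        means = np.array([col[y == c].mean() for c in classes])
        variances = np.array([col[y == c].var() for c in classes])
        scores.append(means.var() / (variances.mean() + 1e-12))
    return np.array(scores)

def select_top_k(X, y, k):
    """Return the (sorted) indices of the k highest-ranked features."""
    order = np.argsort(fisher_ratio(X, y))[::-1]
    return np.sort(order[:k])

# Synthetic two-class data: 10 features, only 0 and 3 carry class info.
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, n)
X = rng.normal(size=(n, 10))
X[:, 0] += 3.0 * y   # informative feature
X[:, 3] += 2.5 * y   # informative feature

selected = select_top_k(X, y, 2)
print(selected)  # the two informative features should rank highest
```

In the paper's setting, `X` would hold the Interspeech 2009 Challenge feature vectors and the selected subset would then be fed to the GMM-UBM classifier; any such wrapper-style evaluation of candidate subsets is outside this sketch.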
References
Pantic, M., Rothkrantz, L.: Toward an Affect-Sensitive Multi-Modal Human-Computer Interaction. Proc. of the IEEE 91, 1370–1390 (2003)
Cowie, R., Douglas-Cowie, E., Tsapatsoulis, N., Votsis, G., Kollias, S., Fellenz, W., Taylor, J.G.: Emotion Recognition in Human-Computer Interaction. IEEE Signal Processing Magazine 18(1), 32–80 (2001)
Batliner, A., Fisher, K., Huber, R., Spilker, J., Nöth, E.: How to Find Trouble in Communication. Speech Communication 40, 117–143 (2003)
Batliner, A., Burkhardt, F., van Ballegooy, M., Nöth, E.: A Taxonomy of Applications that Utilize Emotional Awareness. In: Erjavec, T., Gros, J. (eds.) Language Technologies, IS-LTC 2006, pp. 246–250 (2006)
Callejas, Z., Lopez-Cozar, R.: Influence of Contextual Information in Emotion Annotation for Spoken Dialogue Systems. Speech Communication, 416–433 (2008)
Iliou, T., Anagnostopoulos, C.N.: Comparison of Different Classifiers for Emotion Recognition. In: 13th Panhellenic Conference on Informatics, pp. 102–106 (2009)
Seppi, D., Batliner, A., Schuller, B., Steidl, S., Vogt, T., Wagner, J., Devillers, L., Vidrascu, L., Amir, N., Aharonson, V.: Patterns, Prototypes, Performance: Classifying Emotional User States. In: Interspeech 2008, pp. 601–604 (2008)
Steidl, S.: Automatic Classification of Emotion-Related User States in Spontaneous Children’s Speech. Logos Verlag, Berlin (2009)
Batliner, A., Steidl, S., Hacker, C., Nöth, E.: Private Emotions vs. Social Interaction – a Data-driven Approach towards Analysing Emotion in Speech. User Modeling and User-Adapted Interaction (UMUAI) 18(1-2), 175–206 (2008)
Ververidis, D., Kotropoulos, C.: Fast and Accurate Feature Subset Selection Applied into Speech Emotion Recognition. Elsevier Signal Processing 88(12), 2956–2970 (2008)
Brendel, M., Zaccarelli, R., Devillers, L.: Building a System for Emotions Detection from Speech to Control an Affective Avatar. In: Proceedings of LREC 2010, pp. 2205–2210 (2010)
Schuller, B., Steidl, S., Batliner, A.: The Interspeech 2009 Emotion Challenge. In: Interspeech 2009, ISCA, Brighton, UK, pp. 312–315 (2009)
Schuller, B., Steidl, S., Batliner, A., Burkhardt, F., Devillers, L., Mueller, C., Narayanan, S.: The Interspeech 2010 Paralinguistic Challenge. In: Interspeech 2010, ISCA, Makuhari, Japan (2010)
Kockmann, M., Burget, L., Cernocky, J.: Brno University of Technology System for Interspeech 2009 Emotion Challenge. In: Interspeech 2009, ISCA, Brighton, UK, pp. 348–351 (2009)
Steidl, S., Schuller, B., Seppi, D., Batliner, A.: The Hinterland of Emotions: Facing the Open-Microphone Challenge. In: Proc. 4th International HUMAINE Association Conference on Affective Computing and Intelligent Interaction 2009 (ACII 2009), vol. 1, pp. 690–697 (2009)
Eyben, F., Wollmer, M., Schuller, B.: openEAR - Introducing the Munich Open-Source Emotion and Affect Recognition Toolkit. In: Proc. of the 4th International HUMAINE Association Conference on Affective Computing and Intelligent Interaction 2009 (ACII 2009). IEEE, Amsterdam (2009)
Reynolds, D.A., Rose, R.C.: Robust Text-Independent Speaker Identification Using Gaussian Mixture Speaker Models. IEEE Transactions on Speech and Audio Processing 3, 72–83 (1995)
Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum Likelihood from Incomplete Data via the EM Algorithm. J. Roy. Stat. Soc. 39, 1–38 (1977)
Reynolds, D.A., Quatieri, T.F., Dunn, R.B.: Speaker Verification Using Adapted Gaussian Mixture Models. Digital Signal Processing 10, 19–41 (2000)
Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques, 2nd edn. Morgan Kaufmann, San Francisco (2005)
Schuller, B., Batliner, A., Seppi, D., Steidl, S., Vogt, T., Wagner, J., Devillers, L., Vidrascu, L., Amir, N., Kessous, L., Aharonson, V.: The Relevance of Feature Type for the Automatic Classification of Emotional User States: Low Level Descriptors and Functionals. In: Interspeech 2007, ISCA, Antwerp, Belgium, pp. 2253–2256 (2007)
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kostoulas, T., Ganchev, T., Lazaridis, A., Fakotakis, N. (2010). Enhancing Emotion Recognition from Speech through Feature Selection. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2010. Lecture Notes in Computer Science(), vol 6231. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15760-8_43
DOI: https://doi.org/10.1007/978-3-642-15760-8_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15759-2
Online ISBN: 978-3-642-15760-8
eBook Packages: Computer Science (R0)