Abstract
The ability of a machine to discern categories of emotion is of great interest in many applications. This paper explores the use of formant-based features, alongside baseline prosodic and spectral features, for classifying emotion along the dimensions of arousal, valence, expectancy, and power. Three feature selection criteria are compared, namely maximum average recall, maximal relevance, and minimal-redundancy-maximal-relevance, to find the criterion that gives the highest unweighted accuracy. With a Gaussian Mixture Model classifier, the results indicate that the formant-based features yield a statistically significant improvement in the accuracy of the classification system.
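As an illustration of the evaluation measure mentioned above, the following is a minimal sketch that assumes "unweighted accuracy" means the mean of the per-class recalls, as the term is commonly used in emotion recognition; the abstract itself does not define it. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return {label: hits[label] / totals[label] for label in totals}


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls: every class counts equally (assumed definition)."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def weighted_accuracy(y_true, y_pred):
    """Plain fraction of correct predictions: frequent classes dominate."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical binary labels for one dimension (1 = high arousal, 0 = low).
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]

print(weighted_accuracy(y_true, y_pred))    # 9 of 10 predictions are correct
print(unweighted_accuracy(y_true, y_pred))  # mean of recalls 1.0 and 0.5
```

On imbalanced labels such as these, the plain accuracy rewards a classifier that favours the majority class, while the per-class average does not. This is one likely reason for reporting the unweighted figure.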
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, J.C., Rao, H., Clements, M.A. (2011). Investigating the Use of Formant Based Features for Detection of Affective Dimensions in Speech. In: D’Mello, S., Graesser, A., Schuller, B., Martin, JC. (eds) Affective Computing and Intelligent Interaction. ACII 2011. Lecture Notes in Computer Science, vol 6975. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24571-8_48
DOI: https://doi.org/10.1007/978-3-642-24571-8_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24570-1
Online ISBN: 978-3-642-24571-8
eBook Packages: Computer Science (R0)