Abstract
For emotion recognition in speech, we have developed two feature extraction algorithms that emphasize subtle emotional differences while de-emphasizing the dominant linguistic components. The starting point is a set of 200 statistical features extracted from the intensity and pitch time series, which is regarded as a superset of the necessary emotional features. The first algorithm, rNMF (representative Non-negative Matrix Factorization), selects the simple features that best represent the complex NMF-based features: it first extracts a large set of complex, almost mutually independent features by unsupervised learning and later selects a small number of simple features for the classification task. The second algorithm, dNMF (discriminant NMF), extracts only the discriminant features by adding the Fisher criterion as an additional constraint on the cost function of the standard NMF algorithm. Both algorithms demonstrate much better recognition rates on the popular Berlin database, even with only 20 features.
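The two ingredients named above, standard NMF and a Fisher-type class-separation measure, can be illustrated with a minimal sketch. It is not the authors' rNMF or dNMF: it runs the standard multiplicative-update NMF of Lee and Seung on a non-negative feature matrix and then scores each learned component by a ratio of between-class to within-class variance. The function names (`nmf`, `fisher_scores`), the random toy data, the three-class labels, and the assumption that the 200 features have already been scaled to be non-negative are all hypothetical choices made for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Standard NMF, V ~ W @ H, using multiplicative updates for the
    squared-error (Frobenius) cost. V must be non-negative."""
    rng = np.random.default_rng(seed)
    n_features, n_samples = V.shape
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        # The updates multiply by non-negative ratios, so W and H stay non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def fisher_scores(H, labels):
    """Per-component ratio of between-class to within-class scatter.
    H has shape (rank, n_samples); labels has one class id per sample."""
    labels = np.asarray(labels)
    overall_mean = H.mean(axis=1)
    between = np.zeros(H.shape[0])
    within = np.zeros(H.shape[0])
    for c in np.unique(labels):
        Hc = H[:, labels == c]
        mean_c = Hc.mean(axis=1)
        between += Hc.shape[1] * (mean_c - overall_mean) ** 2
        within += ((Hc - mean_c[:, None]) ** 2).sum(axis=1)
    return between / (within + 1e-12)

# Toy stand-in for the data: 200 non-negative features, 60 utterances, 3 classes.
# (Assumption: real features would first be rescaled to be non-negative.)
rng = np.random.default_rng(1)
V = rng.random((200, 60))
labels = np.repeat(np.arange(3), 20)

W, H = nmf(V, rank=20)             # 20 NMF components
scores = fisher_scores(H, labels)  # higher score = more class-separating component
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

In this sketch the Fisher ratio is only computed after the factorization. The abstract describes dNMF as building the Fisher criterion into the NMF cost function itself, which would change the update rules and is not reproduced here.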
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, D., Lee, SY., Amari, Si. (2009). Representative and Discriminant Feature Extraction Based on NMF for Emotion Recognition in Speech. In: Leung, C.S., Lee, M., Chan, J.H. (eds) Neural Information Processing. ICONIP 2009. Lecture Notes in Computer Science, vol 5863. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10677-4_74
DOI: https://doi.org/10.1007/978-3-642-10677-4_74
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10676-7
Online ISBN: 978-3-642-10677-4
eBook Packages: Computer Science, Computer Science (R0)