Abstract
This paper describes a speech emotion recognition system built for the Audio Sub-Challenge of the Audio/Visual Emotion Challenge (AVEC 2011). In this system, feature selection is performed via L1-regularized linear regression, in which the L1 norm of the regression weights is minimized to find a sparse weight vector. Features with approximately zero weights are removed, leaving a small, well-selected feature set. For classification, a fusion scheme is proposed that combines the strengths of linear regression and extreme learning machine (ELM) based feedforward neural networks (NN). Experimental results on the SEMAINE database of naturalistic dialogues, distributed through AVEC 2011, are presented.
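The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the penalized (lasso) objective, the ISTA solver, the tanh hidden layer, the equal-weight score averaging, the synthetic data, and all function names and parameter values (`lasso_ista`, `lam`, `n_hidden`, `fuse`, and so on) are assumptions chosen for illustration only.

```python
import numpy as np

def lasso_ista(X, y, lam=0.1, n_iter=500):
    """Minimise 0.5/n * ||Xw - y||^2 + lam * ||w||_1 by proximal gradient (ISTA).
    Assumption: the paper's exact L1 formulation and solver are not given here."""
    n, d = X.shape
    w = np.zeros(d)
    # Step size 1/L, with L the Lipschitz constant of the smooth term's gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        z = w - step * grad
        # Soft thresholding sets small weights exactly to zero.
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return w

def select_features(w, tol=1e-6):
    """Indices of features whose weights are not approximately zero."""
    return np.flatnonzero(np.abs(w) > tol)

def elm_fit(X, y, n_hidden=50, seed=0):
    """ELM: random, fixed hidden layer; output weights by least squares."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)
    beta = np.linalg.pinv(H) @ y
    return W, b, beta

def elm_predict(model, X):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

def fuse(score_a, score_b, weight=0.5):
    """Hypothetical fusion rule: weighted average of the two real-valued scores."""
    return weight * score_a + (1.0 - weight) * score_b

def demo(n=200, d=30, seed=0):
    """Synthetic two-class data in which only the first three features matter."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    y = np.sign(X[:, 0] - X[:, 1] + 0.5 * X[:, 2])  # labels in {-1, +1}

    w = lasso_ista(X, y, lam=0.1)
    selected = select_features(w)

    X_sel = X[:, selected]
    linear_score = X_sel @ w[selected]
    elm_score = elm_predict(elm_fit(X_sel, y), X_sel)

    fused = fuse(linear_score, elm_score)
    accuracy = np.mean(np.sign(fused) == y)  # training accuracy only
    return selected, accuracy

if __name__ == "__main__":
    selected, accuracy = demo()
    print("selected features:", selected)
    print("training accuracy of fused classifier:", accuracy)
```

Averaging the two scores is only one possible fusion rule; the abstract does not say how the linear regression and ELM outputs are combined, so the `weight` parameter should be read as a placeholder.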
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cen, L., Yu, Z.L., Dong, M.H. (2011). Speech Emotion Recognition System Based on L1 Regularized Linear Regression and Decision Fusion. In: D’Mello, S., Graesser, A., Schuller, B., Martin, JC. (eds) Affective Computing and Intelligent Interaction. ACII 2011. Lecture Notes in Computer Science, vol 6975. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24571-8_44
DOI: https://doi.org/10.1007/978-3-642-24571-8_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24570-1
Online ISBN: 978-3-642-24571-8
eBook Packages: Computer Science (R0)