Abstract
This paper compares human (listening test) and automatic (SVM classifier) recognition of emotion in speech. The Polish emotional speech database used in the tests contains recordings of six acted emotional states (anger, sadness, happiness, fear, disgust, surprise) and the neutral state, produced by 13 amateur speakers (2118 utterances). The automatic classifier used a set of 31 features selected by attribute evaluation and the C-SVC algorithm with a Gaussian radial basis function (RBF) kernel. The mean overall recognition score was lower for human listeners (57.25%) than for the automatic classifier (64.77%).
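As a rough illustration of the kernel named in the abstract, the sketch below evaluates the Gaussian RBF kernel, K(x, y) = exp(-gamma * ||x - y||^2), on two 31-dimensional vectors. The feature values, the `GAMMA` setting, and the helper names `rbf_kernel` and `kernel_matrix` are assumptions made for this example; the abstract does not give the paper's features or kernel parameters.

```python
import math

def rbf_kernel(x, y, gamma):
    """Gaussian RBF kernel: exp(-gamma * squared Euclidean distance)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def kernel_matrix(samples, gamma):
    """Pairwise kernel values for a list of feature vectors."""
    return [[rbf_kernel(a, b, gamma) for b in samples] for a in samples]

# Hypothetical 31-dimensional feature vectors (made-up values, not from the paper).
N_FEATURES = 31
u = [0.0] * N_FEATURES
v = [0.1] * N_FEATURES

# Assumed kernel width; the abstract does not state the value used.
GAMMA = 1.0 / N_FEATURES

K = kernel_matrix([u, v], GAMMA)
```

A C-SVC classifier evaluates this kernel between a test vector and its support vectors; the kernel value is 1 for identical vectors and decays toward 0 as they move apart, at a rate set by `gamma`.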
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Staroniewicz, P. (2009). Recognition of Emotional State in Polish Speech - Comparison between Human and Automatic Efficiency. In: Fierrez, J., Ortega-Garcia, J., Esposito, A., Drygajlo, A., Faundez-Zanuy, M. (eds) Biometric ID Management and Multimodal Communication. BioID 2009. Lecture Notes in Computer Science, vol 5707. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04391-8_5
DOI: https://doi.org/10.1007/978-3-642-04391-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04390-1
Online ISBN: 978-3-642-04391-8
eBook Packages: Computer Science, Computer Science (R0)