Influence of Speakers’ Emotional States on Voice Recognition Scores

Staroniewicz, Piotr

doi:10.1007/978-3-642-25775-9_22

Piotr Staroniewicz

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6800)

Abstract

The paper presents voice recognition EER (Equal Error Rate) scores for speakers’ basic emotional states. The database of Polish emotional speech used in the tests includes recordings of six acted emotional states (anger, sadness, happiness, fear, disgust, surprise) and the neutral state from 13 amateur speakers (2118 utterances). Voice recognition was carried out with MFCC features and GMM classifiers. The EER scores depend distinctly on the speakers’ emotional states, even for a simulated database. The mean EER results tend to be only slightly less sensitive to emotional state even when speech with various kinds of emotional arousal is included in the training set.
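
The abstract reports results as EER values without describing how they are obtained. As a rough illustration only, the sketch below shows one common way to estimate an EER from two lists of verification scores: sweep a decision threshold, compute the false acceptance rate (FAR) and false rejection rate (FRR) at each step, and report the point where the two are closest. The function name `equal_error_rate`, the acceptance rule (accept when the score is at or above the threshold), and the example scores are assumptions made for this sketch; they are not taken from the paper.

```python
from typing import Sequence, Tuple


def equal_error_rate(genuine: Sequence[float],
                     impostor: Sequence[float]) -> Tuple[float, float]:
    """Estimate the EER and the threshold at which it occurs.

    genuine:  scores from trials where the claimed speaker is the true one.
    impostor: scores from trials where the claimed speaker is someone else.
    A trial is accepted when its score is >= the threshold (an assumption).
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus one value above the
    # maximum so that the "reject everything" operating point is covered.
    candidates = sorted(set(genuine) | set(impostor))
    candidates.append(candidates[-1] + 1.0)

    best_gap, best_eer, best_thr = float("inf"), 1.0, candidates[0]
    for thr in candidates:
        # FAR: share of impostor trials that would be wrongly accepted.
        far = sum(s >= thr for s in impostor) / len(impostor)
        # FRR: share of genuine trials that would be wrongly rejected.
        frr = sum(s < thr for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if gap < best_gap:
            # With finite data FAR and FRR rarely match exactly, so the
            # EER is approximated by their mean at the closest point.
            best_gap, best_eer, best_thr = gap, (far + frr) / 2.0, thr
    return best_eer, best_thr


# Hypothetical scores, for illustration only.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.2f} at threshold {threshold:.2f}")
```

In a study like this one, such a computation would presumably be repeated per emotional state, so that each state's EER can be compared with the neutral case.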

Author information

Authors and Affiliations

Institute of Telecommunications, Teleinformatics and Acoustics, Wroclaw University of Technology, Wybrzeze Wyspianskiego 27, 50-370, Wroclaw, Poland
Piotr Staroniewicz

Editor information

Editors and Affiliations

Dept. of Psychology and IIASS, International Institute for Advanced Scientific Studies, Second University of Naples, Vietri sul Mare, SA, Italy
Anna Esposito
School of Computing Science, University of Glasgow, Glasgow, UK
Alessandro Vinciarelli
Department of Telecommunication and Media Informatics, Laboratory of Speech Acoustics, Budapest University of Technology and Economics, 1117, Budapest, Hungary
Klára Vicsi
TELECOM ParisTech, CNRS-LTCI UMR 5141, 75014, Paris, France
Catherine Pelachaud
Faculty of Electrical Engineering, Mathematics and Computer Science, University of Twente, 7500 AE, Enschede, The Netherlands
Anton Nijholt

About this paper

Cite this paper

Staroniewicz, P. (2011). Influence of Speakers’ Emotional States on Voice Recognition Scores. In: Esposito, A., Vinciarelli, A., Vicsi, K., Pelachaud, C., Nijholt, A. (eds) Analysis of Verbal and Nonverbal Communication and Enactment. The Processing Issues. Lecture Notes in Computer Science, vol 6800. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25775-9_22

DOI: https://doi.org/10.1007/978-3-642-25775-9_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25774-2
Online ISBN: 978-3-642-25775-9
eBook Packages: Computer Science, Computer Science (R0)
