Abstract
This paper examines how emotional change in speech affects speaker identification by humans and by machines. A contrastive experiment is carried out on the emotional speech corpus MASC, comparing an Automatic Speaker Identification (ASI) system (using GMM-UBM together with the Emotional Factor Analysis (EFA) algorithm) against aural identification by human listeners. The results resemble those reported under channel-mismatched conditions: the ASI system clearly outperforms a single listener, especially when the EFA emotion-compensation algorithm is applied. Fusing the judgments of multiple listeners, however, improves the aural system's performance by 23.86% and makes it outperform the ASI system.
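The abstract states that fusing the decisions of multiple listeners raises aural identification accuracy, but does not specify the fusion rule. As a minimal sketch, assuming a simple majority vote over per-listener speaker labels (one common way to combine independent listener judgments, not necessarily the rule used in the paper):

```python
from collections import Counter

def fuse_listener_decisions(decisions):
    """Fuse per-listener speaker-identification decisions by majority vote.

    `decisions` is a list of speaker labels, one per listener. Majority
    voting is an illustrative choice here; the paper may use a different
    fusion scheme (e.g. score-level averaging).
    """
    if not decisions:
        raise ValueError("need at least one listener decision")
    counts = Counter(decisions)
    # most_common(1) returns [(label, count)] for the best-supported label;
    # ties are broken by first appearance in the input
    return counts.most_common(1)[0][0]

# Example: three listeners, two of whom identify the same speaker
print(fuse_listener_decisions(["spk07", "spk12", "spk07"]))  # spk07
```

The intuition is the usual one for ensembles: individual listeners make partly independent errors on emotional speech, so aggregating their votes can recover the correct identity more often than any single listener.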
This work was supported by NSFC Grant No. 60970080 and the Special Funds for the Key Program of China, No. 2009ZX01039-002-001-04.
References
Schmidt-Nielsen, A., Crystal, T.H.: Speaker verification by human listeners: experiments comparing human and machine performance using the NIST 1998 speaker verification data. Digital Signal Processing 10, 249–266 (2000)
Kajarekar, S.S., Bratt, H., Shriberg, E., de Leon, R.: A study of intentional voice modifications for evading automatic speaker recognition. In: Odyssey 2006 (2006)
Hautamaki, V., Kinnunen, T., Nosratighods, M., Lee, K.-A., Ma, B., Li, H.: Approaching human listener accuracy with modern speaker verification. In: Interspeech 2010, pp. 1473–1476 (2010)
NIST: The NIST Year 2010 Speaker Recognition Evaluation Plan (2010)
Shriberg, E., Graciarena, M., Bratt, H., Kathol, A., Kajarekar, S., Jameel, H., Richey, C., Goodman, F.: Effects of vocal effort and speaking style on text-independent speaker verification. In: Interspeech 2007, Antwerp, pp. 950–954 (2007)
Wu, T., Yang, Y., Wu, Z., Li, D.: MASC: A Speech Corpus in Mandarin for Emotion Analysis and Affective Speaker Recognition. In: Odyssey 2006, pp. 1–5 (June 2006)
Kenny, P., Boulianne, G., Ouellet, P., Dumouchel, P.: Joint factor analysis versus eigenchannels in speaker recognition. IEEE Transactions on Audio, Speech, and Language Processing 15(4), 1435–1447 (2007)
Chen, L., Yang, Y.: Applying Emotional Factor Analysis and I-Vector to Emotional Speaker Recognition. Submitted to CCBR (2011)
http://speech.fit.vutbr.cz/en/software/joint-factor-analysis-matlab-demo
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
Cite this paper
Yang, Y., Chen, L., Wang, W. (2011). Emotional Speaker Identification by Humans and Machines. In: Sun, Z., Lai, J., Chen, X., Tan, T. (eds) Biometric Recognition. CCBR 2011. Lecture Notes in Computer Science, vol 7098. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25449-9_21
DOI: https://doi.org/10.1007/978-3-642-25449-9_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25448-2
Online ISBN: 978-3-642-25449-9