Toward Emotional Speaker Recognition: Framework and Preliminary Results

Yang, Yingchun; Chen, Li

doi:10.1007/978-3-642-35136-5_29

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7701)

Abstract

Besides channel and environmental noise, emotion variability in speech signals has been found to be another important factor that drastically degrades the performance of most speaker recognition systems proposed in the literature. One open question is how to make the current GMM-UBM system adaptive to emotion variability. We therefore propose a framework named Deformation Compensation (DC) for emotional speaker recognition, which views emotion variability as deformation (a kind of distribution distortion in the feature space) and compensates for it by making dynamic modifications at the feature, model, and score levels. This paper reports the preliminary results obtained so far, including the proposed Deformation Compensation framework and a preliminary case study on GMM-UBM.
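
For readers unfamiliar with the GMM-UBM baseline mentioned in the abstract, the following is a minimal sketch of how such a system typically scores a test utterance: the average per-frame log-likelihood ratio between a speaker model and a universal background model (UBM). It is not the Deformation Compensation method, whose details are not given here. The function names, the toy two-dimensional models, and all numbers are assumptions made purely for illustration, and the shifted means merely stand in for MAP adaptation.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log p(x | gmm), where gmm is a list of (weight, mean, var) components."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def llr_score(frames, speaker_gmm, ubm):
    """Average per-frame log-likelihood ratio of speaker model versus UBM."""
    total = sum(
        gmm_log_likelihood(x, speaker_gmm) - gmm_log_likelihood(x, ubm)
        for x in frames
    )
    return total / len(frames)

# Toy 2-D models (hypothetical numbers, for illustration only).
ubm = [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [3.0, 3.0], [1.0, 1.0])]
# Speaker model: the UBM with shifted means, standing in for MAP adaptation.
speaker = [(0.5, [0.5, 0.5], [1.0, 1.0]), (0.5, [3.5, 3.5], [1.0, 1.0])]

target_frames = [[0.6, 0.4], [3.4, 3.6]]      # close to the speaker means
impostor_frames = [[-0.5, -0.5], [2.5, 2.5]]  # closer to the UBM means

target_score = llr_score(target_frames, speaker, ubm)
impostor_score = llr_score(impostor_frames, speaker, ubm)
```

In this toy setup the target frames score above zero and the impostor frames below it. Emotion variability can be thought of as shifting the test frames away from the speaker model, which is the kind of mismatch a compensation step at the feature, model, or score level would aim to reduce.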

Author information

Authors and Affiliations

College of Computer Science & Technology, Zhejiang University, Hangzhou, China
Yingchun Yang & Li Chen

Editor information

Editors and Affiliations

School of Information Science and Technology, Sun Yat-Sen University, 510275, Guangzhou, P.R. China
Wei-Shi Zheng & Jianhuang Lai
Institute of Automation, National Laboratory of Pattern Recognition, Chinese Academy of Sciences, 100190, Beijing, P.R. China
Zhenan Sun
School of Computer Science and Engineering, Beihang University, Beijing University of Aeronautics and Astronautics, 100191, Beijing, P.R. China
Yunhong Wang
Institute of Computing Technology, Chinese Academy of Sciences, 100190, Beijing, P.R. China
Xilin Chen
Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Kowloon, Hong Kong, China
Pong C. Yuen

About this paper

Cite this paper

Yang, Y., Chen, L. (2012). Toward Emotional Speaker Recognition: Framework and Preliminary Results. In: Zheng, WS., Sun, Z., Wang, Y., Chen, X., Yuen, P.C., Lai, J. (eds) Biometric Recognition. CCBR 2012. Lecture Notes in Computer Science, vol 7701. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35136-5_29

DOI: https://doi.org/10.1007/978-3-642-35136-5_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35135-8
Online ISBN: 978-3-642-35136-5
eBook Packages: Computer Science, Computer Science (R0)
