Abstract
Besides channel and environmental noise, emotion variability in speech signals has been found to be another important factor that drastically degrades the performance of most speaker recognition systems proposed in the literature. How to make the current GMM-UBM system adaptive to emotion variability is one question to consider. We therefore propose a framework named Deformation Compensation (DC) for emotional speaker recognition, which views emotion variability as deformation (a form of distribution distortion in the feature space) and compensates for it through dynamic modification at the feature, model and score levels. This paper reports the preliminary results obtained so far, including the proposed Deformation Compensation framework and a preliminary case study on GMM-UBM.
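For readers unfamiliar with the GMM-UBM baseline named above, the following minimal sketch shows how such a system conventionally scores a test utterance: the average log-likelihood under a speaker GMM minus that under the universal background model (UBM). It illustrates only the generic baseline, not the Deformation Compensation method, whose details are not given here. The function names, the diagonal-covariance assumption, and all toy parameters are assumptions chosen for illustration.

```python
import numpy as np

def diag_gmm_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood of frames (T, D) under a
    diagonal-covariance GMM with K components."""
    diff = frames[:, None, :] - means[None, :, :]                      # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (K,)
    log_comp = log_norm[None, :] - 0.5 * np.sum(
        diff ** 2 / variances[None, :, :], axis=2)                     # (T, K)
    weighted = log_comp + np.log(weights)[None, :]
    # Log-sum-exp over components, stabilised by subtracting the row maximum.
    peak = weighted.max(axis=1, keepdims=True)
    per_frame = peak[:, 0] + np.log(np.exp(weighted - peak).sum(axis=1))
    return float(per_frame.mean())

def llr_score(frames, speaker, ubm):
    """Log-likelihood ratio between a speaker model and the UBM."""
    return diag_gmm_loglik(frames, *speaker) - diag_gmm_loglik(frames, *ubm)

# Toy models (hypothetical values): each is (weights, means, variances).
weights = np.array([0.5, 0.5])
variances = np.ones((2, 2))
ubm = (weights, np.array([[0.0, 0.0], [4.0, 4.0]]), variances)
speaker = (weights, np.array([[1.0, 1.0], [5.0, 5.0]]), variances)

# Two toy feature frames lying on the speaker model's component means.
frames = np.array([[1.0, 1.0], [5.0, 5.0]])
score = llr_score(frames, speaker, ubm)
```

A positive score favours the claimed speaker over the background population. Emotion-induced distortion of the feature distribution would shift `frames` away from the speaker model and lower this score, which is the kind of mismatch that compensation at the feature, model or score level aims to counter.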
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, Y., Chen, L. (2012). Toward Emotional Speaker Recognition: Framework and Preliminary Results. In: Zheng, WS., Sun, Z., Wang, Y., Chen, X., Yuen, P.C., Lai, J. (eds) Biometric Recognition. CCBR 2012. Lecture Notes in Computer Science, vol 7701. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35136-5_29
DOI: https://doi.org/10.1007/978-3-642-35136-5_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35135-8
Online ISBN: 978-3-642-35136-5
eBook Packages: Computer Science, Computer Science (R0)