Speaker-Characterized Emotion Recognition using Online and Iterative Speaker Adaptation

Kim, Jae-Bok; Park, Jeong-Sik; Oh, Yung-Hwan

doi:10.1007/s12559-012-9132-9

Published: 27 March 2012

Cognitive Computation, volume 4, pages 398–408 (2012)

Jae-Bok Kim¹,
Jeong-Sik Park² &
Yung-Hwan Oh³

Abstract

This paper proposes a novel speech emotion recognition (SER) framework for affective interaction between humans and personal devices. Most conventional SER techniques adopt a speaker-independent model framework because of the sparseness of individual speech data. However, a large amount of individual data can be accumulated on a personal device, making it possible to construct speaker-characterized emotion models through a speaker adaptation procedure. In this study, to address problems associated with conventional adaptation approaches in SER tasks, we modified a representative adaptation technique, maximum likelihood linear regression (MLLR), on the basis of selective label refinement. We subsequently carried out the modified MLLR procedure in an online and iterative manner, using accumulated individual data, to further enhance the speaker-characterized emotion models. In SER experiments based on an emotional corpus, our approach outperformed both conventional adaptation techniques and the speaker-independent model framework.
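The abstract does not spell out the adaptation procedure, so the following is only a toy sketch of the general idea it names: an MLLR-style mean transform estimated from unlabeled speaker data, using only utterances whose predicted emotion label looks reliable, and repeated over several passes. Everything here is an assumption made for illustration rather than the authors' method: one unit-variance Gaussian per emotion, a single global transform, a likelihood-margin confidence score, and the hypothetical names `adapt`, `estimate_mllr`, `apply_mllr`, and `margin`.

```python
import numpy as np

def log_likelihoods(X, means):
    # Log-likelihood (up to a constant) of each row of X under
    # unit-variance Gaussians centred at each row of `means`.
    d2 = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return -0.5 * d2

def estimate_mllr(X, labels, si_means):
    # With shared unit covariances, the ML estimate of W in
    # adapted_mean = W @ [1, si_mean] reduces to least squares.
    xi = np.hstack([np.ones((len(labels), 1)), si_means[labels]])
    W_t, *_ = np.linalg.lstsq(xi, X, rcond=None)  # shape (d + 1, d)
    return W_t.T                                  # shape (d, d + 1)

def apply_mllr(W, si_means):
    # Apply the affine transform to every speaker-independent mean.
    xi = np.hstack([np.ones((len(si_means), 1)), si_means])
    return xi @ W.T

def adapt(X, si_means, n_iter=3, margin=1.0):
    # Iterative unsupervised adaptation: relabel the data with the current
    # models, keep only confidently labeled samples, re-estimate W.
    means = si_means.copy()
    for _ in range(n_iter):
        ll = log_likelihoods(X, means)
        labels = ll.argmax(axis=1)
        ranked = np.sort(ll, axis=1)
        confidence = ranked[:, -1] - ranked[:, -2]  # best minus runner-up
        keep = confidence >= margin                 # assumed selection rule
        if not keep.any():
            break
        W = estimate_mllr(X[keep], labels[keep], si_means)
        means = apply_mllr(W, si_means)
    return means

# Synthetic demo: four "emotions" in 2-D, one speaker whose means are shifted.
rng = np.random.default_rng(0)
si_means = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0], [6.0, 6.0]])
shift = np.array([1.0, -1.0])
true_labels = np.repeat(np.arange(4), 50)
X = si_means[true_labels] + shift + 0.5 * rng.standard_normal((200, 2))
adapted_means = adapt(X, si_means)
```

In this synthetic setup the adapted means move from the speaker-independent positions toward the shifted speaker means. In an online setting, `adapt` would be rerun as more speaker data accumulates on the device.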

Acknowledgments

This study was financially supported by the academic research fund of Mokwon University in 2012 and by the Defense Acquisition Program Administration and the Agency for Defense Development under the contract.

Author information

Authors and Affiliations

Future IT convergence Lab, LG Electronics, 221 Yangjae-dong, Seocho-gu, Seoul, South Korea
Jae-Bok Kim
Department of Intelligent Robot Engineering, Mokwon University, Mokwon Gil-21, Daejeon, South Korea
Jeong-Sik Park
Computer Science Department, Korea Advanced Institute of Science and Technology, 291 Daehak-ro, Daejeon, South Korea
Yung-Hwan Oh

Corresponding author

Correspondence to Jeong-Sik Park.

About this article

Cite this article

Kim, JB., Park, JS. & Oh, YH. Speaker-Characterized Emotion Recognition using Online and Iterative Speaker Adaptation. Cogn Comput 4, 398–408 (2012). https://doi.org/10.1007/s12559-012-9132-9

Received: 01 August 2011
Accepted: 13 March 2012
Published: 27 March 2012
Issue Date: December 2012
DOI: https://doi.org/10.1007/s12559-012-9132-9
