IG-Based Feature Extraction and Compensation for Emotion Recognition from Speech

Conference paper
Affective Computing and Intelligent Interaction (ACII 2005)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3784)

Abstract

This paper presents an approach to emotion recognition from speech signals. The intonation groups (IGs) of the input speech signal are first extracted, and speech features are then computed for each selected intonation group. Under the assumption of a linear mapping between the feature spaces of different emotional states, a feature compensation approach is proposed to characterize a feature space with better discriminability among emotional states. The compensation vector for each emotional state is estimated using the Minimum Classification Error (MCE) algorithm. The compensated IG-based feature vectors are used to train a Gaussian Mixture Model (GMM) for each emotional state, and the emotional state whose GMM yields the maximal likelihood ratio is selected as the final output. Experimental results show that IG-based feature extraction and compensation achieve encouraging performance for emotion recognition.
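
The abstract does not give implementation details, but the decision rule it describes (per-emotion compensation vectors applied to IG-based features, one GMM per emotional state, output chosen by maximal likelihood) can be sketched as below. This is a minimal illustration, assuming additive compensation vectors and using raw log-likelihood in place of the paper's likelihood ratio; the names EMOTIONS, train_gmms, and classify are hypothetical, and the MCE estimation of the compensation vectors is not shown.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical emotion label set; the paper's actual categories are not
# listed in the abstract.
EMOTIONS = ["neutral", "happy", "angry", "sad"]

def train_gmms(features_by_emotion, compensation, n_components=4):
    """Train one GMM per emotional state on compensated IG-based features.

    features_by_emotion: emotion -> (n_vectors, dim) array of IG features
    compensation:        emotion -> (dim,) compensation vector (assumed
                         additive; the paper estimates these with MCE)
    """
    gmms = {}
    for emotion, feats in features_by_emotion.items():
        # Linear feature compensation: shift the features by the
        # emotion-specific compensation vector before fitting the GMM.
        gmm = GaussianMixture(n_components=n_components,
                              covariance_type="diag", random_state=0)
        gmm.fit(feats + compensation[emotion])
        gmms[emotion] = gmm
    return gmms

def classify(ig_features, gmms, compensation):
    """Return the emotion whose GMM gives the highest total log-likelihood
    over the utterance's compensated IG feature vectors. (Raw log-likelihood
    is used here as a stand-in for the paper's likelihood ratio.)"""
    scores = {
        emotion: gmms[emotion]
        .score_samples(ig_features + compensation[emotion])
        .sum()
        for emotion in gmms
    }
    return max(scores, key=scores.get)

# Example with synthetic data standing in for real IG features.
rng = np.random.default_rng(0)
dim = 12
comp = {e: rng.normal(scale=0.1, size=dim) for e in EMOTIONS}
train = {e: rng.normal(loc=i, size=(200, dim)) for i, e in enumerate(EMOTIONS)}
models = train_gmms(train, comp)
print(classify(rng.normal(loc=2.0, size=(8, dim)), models, comp))
```

Summing per-IG log-likelihoods treats the IG feature vectors as conditionally independent given the emotional state, which is the standard scoring assumption for GMM-based classifiers.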


Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chuang, Z.J., Wu, C.H. (2005). IG-Based Feature Extraction and Compensation for Emotion Recognition from Speech. In: Tao, J., Tan, T., Picard, R.W. (eds.) Affective Computing and Intelligent Interaction. ACII 2005. Lecture Notes in Computer Science, vol. 3784. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11573548_46

  • DOI: https://doi.org/10.1007/11573548_46

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29621-8

  • Online ISBN: 978-3-540-32273-3

  • eBook Packages: Computer Science, Computer Science (R0)
