A Study of Speech Emotion Recognition and Its Application to Mobile Services

  • Conference paper
Ubiquitous Intelligence and Computing (UIC 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4611)

Abstract

In this paper, a speech emotion recognition agent for mobile communication services is proposed. The system recognizes five emotional states (neutral, happiness, sadness, anger, and annoyance) in real time from speech captured by a cellular phone, and then estimates degrees of affection, such as love, truthfulness, weariness, trickery, and friendship, for the person one wishes to know more about over the mobile phone. In general, speech acquired by a cellular phone is corrupted by mobile-network and environmental noise, which distorts the emotional features of the query speech and can seriously degrade recognition performance. To alleviate the effect of these noises, we adopt an MA (moving average) filter, which has a relatively simple structure and low computational complexity. A feature optimization method is then applied to further improve and stabilize system performance. As a practical application, we build an agent that measures these degrees of affection over the mobile phone. Two pattern classification methods, k-NN and an SVM with probability estimates, are compared for estimating the degree of affection. The experimental results show that the proposed method achieves stable classification performance of 72.5% over the five emotional states, demonstrating the feasibility of the agent for mobile communication services.
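The abstract outlines a two-stage pipeline: an MA (moving average) filter smooths the noisy emotional features extracted from phone-captured speech, and a classifier (k-NN or an SVM with probability estimates) maps the smoothed features to the five emotion classes. The sketch below is a minimal, hypothetical illustration of that idea in Python with NumPy and scikit-learn; the feature values, window size, pooling step, and training data are placeholder assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.svm import SVC

EMOTIONS = ["neutral", "happiness", "sadness", "anger", "annoyance"]

def moving_average(features, window=5):
    """Smooth each feature trajectory (frames x dims) with a simple MA filter."""
    kernel = np.ones(window) / window
    return np.apply_along_axis(
        lambda x: np.convolve(x, kernel, mode="same"), axis=0, arr=features
    )

# Placeholder utterance-level training data (real features would come from
# pitch/energy/spectral measurements of labeled emotional speech).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 12))
y_train = rng.integers(0, len(EMOTIONS), size=100)

# SVM with probability estimates (pairwise coupling) over the five classes.
clf = SVC(probability=True)
clf.fit(X_train, y_train)

# For a query utterance: smooth the frame-level features, pool them into one
# vector, and read off per-emotion probabilities.
frames = rng.normal(size=(200, 12))          # placeholder frame features
query = moving_average(frames).mean(axis=0)
probs = clf.predict_proba([query])[0]
print(dict(zip(EMOTIONS, probs.round(3))))
```

The per-class probabilities, rather than a single hard label, are what make a downstream "degree of affection" score straightforward to derive, which is presumably why the paper compares k-NN against an SVM with probability estimates.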




Editor information

Jadwiga Indulska, Jianhua Ma, Laurence T. Yang, Theo Ungerer, Jiannong Cao

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yoon, WJ., Cho, YH., Park, KS. (2007). A Study of Speech Emotion Recognition and Its Application to Mobile Services. In: Indulska, J., Ma, J., Yang, L.T., Ungerer, T., Cao, J. (eds) Ubiquitous Intelligence and Computing. UIC 2007. Lecture Notes in Computer Science, vol 4611. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73549-6_74

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-73549-6_74

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73548-9

  • Online ISBN: 978-3-540-73549-6

  • eBook Packages: Computer Science (R0)
