Speech Emotion Pattern Recognition Agent in Mobile Communication Environment Using Fuzzy-SVM

Cho, Youn-Ho; Park, Kyu-Sik; Pak, Ro Jin

doi:10.1007/978-3-540-71441-5_46

Conference paper

Part of the book series: Advances in Soft Computing (AINSC, volume 40)

Abstract

In this paper, we propose a speech emotion recognition agent for the mobile communication environment. The agent recognizes five emotional states (neutral, happiness, sadness, anger, and annoyance) from speech captured by a cellular phone in real time. In general, speech transmitted through a mobile network contains both environmental noise from the speaker's surroundings and network noise, which distort the emotional features of the query speech and can cause serious performance degradation. To minimize the effect of this noise and improve system performance, we adopt an MA (Moving Average) filter, which has a relatively simple structure and low computational complexity. An SFS (Sequential Forward Selection) feature optimization method is then applied to further improve and stabilize system performance. For practical application to the call center problem, we created a second emotional engine that distinguishes two emotional states: "agitation", which includes anger, happiness, and annoyance, and "calm", which includes the neutral and sadness states. Two pattern classification methods, k-NN and Fuzzy-SVM, are compared for emotional state classification. The experimental results indicate that the proposed method provides stable classification performance, reaching 72.5% over five emotional states and 86.5% over two emotional states.
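
The abstract names three building blocks that may be easier to follow in code: MA filtering, SFS feature selection, and the five-to-two grouping of emotional states. The following is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not say which signal the MA filter is applied to, which features are extracted, or how SFS candidates are scored, so the sketch assumes a causal filter over a generic 1-D sequence and scores feature subsets by leave-one-out k-NN accuracy. All function names are hypothetical, and Fuzzy-SVM is not implemented here.

```python
from collections import Counter
from math import dist

# Grouping stated in the abstract: five emotions collapsed into two classes.
TWO_CLASS = {
    "anger": "agitation", "happiness": "agitation", "annoyance": "agitation",
    "neutral": "calm", "sadness": "calm",
}

def moving_average(x, window=3):
    """Causal moving average; the window is truncated at the start of x.
    Assumption: the abstract does not specify the filter length or input."""
    out = []
    for i in range(len(x)):
        chunk = x[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k nearest training vectors (Euclidean)."""
    order = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], query))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]

def loo_accuracy(X, y, cols, k=3):
    """Leave-one-out k-NN accuracy using only the feature columns in cols.
    Assumption: this scoring criterion is chosen for illustration only."""
    proj = [[row[c] for c in cols] for row in X]
    hits = 0
    for i in range(len(proj)):
        rest_X = proj[:i] + proj[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, proj[i], k) == y[i]
    return hits / len(proj)

def sfs(X, y, n_select, k=3):
    """Sequential forward selection: start from an empty set and greedily
    add the single feature that gives the best score, n_select times."""
    selected = []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < n_select:
        best = max(remaining, key=lambda c: loo_accuracy(X, y, selected + [c], k))
        selected.append(best)
        remaining.remove(best)
    return selected
```

For the two-state engine, five-state labels can be mapped through `TWO_CLASS` before training, for example `[TWO_CLASS[label] for label in labels]`. SFS is greedy, so it evaluates far fewer subsets than an exhaustive search but is not guaranteed to find the best subset.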

Author information

Authors and Affiliations

Dankook University Division of Information and Computer Science San 8, Hannam-Dong, Yongsan-Ku, Seoul, 140-714, Korea
Youn-Ho Cho, Kyu-Sik Park & Ro Jin Pak

Editor information

Bing-Yuan Cao

Cite this paper

Cho, YH., Park, KS., Pak, R.J. (2007). Speech Emotion Pattern Recognition Agent in Mobile Communication Environment Using Fuzzy-SVM. In: Cao, BY. (eds) Fuzzy Information and Engineering. Advances in Soft Computing, vol 40. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71441-5_46

DOI: https://doi.org/10.1007/978-3-540-71441-5_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71440-8
Online ISBN: 978-3-540-71441-5
eBook Packages: Engineering, Engineering (R0)
