Abstract
In this paper, we propose a speech emotion recognition agent for mobile communication environments. The agent recognizes five emotional states (neutral, happiness, sadness, anger, and annoyance) in real time from speech captured by a cellular phone. Speech transmitted over a mobile network generally contains both environmental noise from the speaker's surroundings and network noise, which distort the emotional features of the query speech and can seriously degrade performance. To minimize the effect of this noise and improve system performance, we adopt an MA (Moving Average) filter, which has a relatively simple structure and low computational complexity. An SFS (Sequential Forward Selection) feature optimization method is then applied to further improve and stabilize system performance. For practical application to call centers, we created a second emotion engine that distinguishes two emotional states: "agitation", which includes anger, happiness, and annoyance, and "calm", which includes the neutral and sadness states. Two pattern classification methods, k-NN and Fuzzy-SVM, are compared for emotional state classification. The experimental results indicate that the proposed method provides stable classification performance of 72.5% over five emotional states and 86.5% over two emotional states.
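The pipeline outlined in the abstract (moving-average smoothing followed by sequential forward selection of features) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the window length, the toy feature values, the leave-one-out scoring, and the use of a plain k-NN classifier as the selection criterion. The abstract does not give the authors' actual feature set, filter parameters, or Fuzzy-SVM configuration, so this should not be read as their implementation.

```python
from typing import List, Sequence


def moving_average(signal: Sequence[float], window: int = 3) -> List[float]:
    """Centered moving average; the window shrinks at the sequence edges."""
    if window < 1:
        raise ValueError("window must be >= 1")
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    votes = [train_y[i] for i in order[:k]]
    # sorted() makes tie-breaking between labels deterministic
    return max(sorted(set(votes)), key=votes.count)


def loo_accuracy(x, y, features, k=3):
    """Leave-one-out k-NN accuracy using only the given feature indices."""
    proj = [[row[f] for f in features] for row in x]
    hits = 0
    for i in range(len(proj)):
        train_x = proj[:i] + proj[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += knn_predict(train_x, train_y, proj[i], k) == y[i]
    return hits / len(proj)


def sequential_forward_selection(x, y, n_select, k=3):
    """Greedily add the feature that most improves leave-one-out accuracy."""
    selected = []
    remaining = list(range(len(x[0])))
    while remaining and len(selected) < n_select:
        best = max(remaining, key=lambda f: loo_accuracy(x, y, selected + [f], k))
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    # Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
    x = [[0.1, 5.0], [0.2, 1.0], [0.15, 9.0], [0.9, 4.0], [1.0, 8.0], [0.95, 2.0]]
    y = ["calm", "calm", "calm", "agitation", "agitation", "agitation"]
    print(moving_average([1, 2, 3, 4, 5], window=3))
    print(sequential_forward_selection(x, y, n_select=1, k=1))
```

In this sketch the selection criterion is the accuracy of the classifier itself (a wrapper approach), which is one common way to run SFS; a different criterion or classifier could be substituted without changing the greedy loop.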
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cho, YH., Park, KS., Pak, R.J. (2007). Speech Emotion Pattern Recognition Agent in Mobile Communication Environment Using Fuzzy-SVM. In: Cao, BY. (eds) Fuzzy Information and Engineering. Advances in Soft Computing, vol 40. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71441-5_46
DOI: https://doi.org/10.1007/978-3-540-71441-5_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71440-8
Online ISBN: 978-3-540-71441-5
eBook Packages: Engineering, Engineering (R0)