Abstract
Research is ongoing worldwide to judge the emotional state of a speaker from the quality of the human voice alone. This paper explores the use of a supervised neural network to design a classifier that can discriminate among several emotional states in speech, such as happiness, anger, fear, sadness, and an unemotional state. The results are found to be significant, both for cognitive science and for speech technology. In the current paper, statistics of the pitch, the first and second formants, energy, and speaking rate are used as relevant features. Several neural-network-based recognizers are created, and ensembles of such recognizers are used as an important part of a decision support system for prioritizing voice messages and assigning a proper agent to respond to each message. The developed intelligent system can be enhanced to automatically detect and adapt to people’s emotional states, and also to design an emotional robot or computer system.
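The abstract does not give implementation details for the decision support system. As a minimal sketch only, the following shows how an ensemble of emotion recognizers could feed message prioritization by majority vote; the `PRIORITY` ordering, the per-message `votes` field, and the function names are assumptions introduced here for illustration, not the authors' method:

```python
from collections import Counter

# Emotion classes named in the abstract.
EMOTIONS = ["happiness", "anger", "fear", "sadness", "unemotional"]

# Hypothetical urgency ordering (lower value = handled first);
# the paper does not specify how priorities are assigned.
PRIORITY = {"anger": 0, "fear": 1, "sadness": 2, "happiness": 3, "unemotional": 4}

def ensemble_vote(predictions):
    """Majority vote over the labels emitted by the individual recognizers.

    Ties are broken deterministically in favor of the more urgent emotion.
    """
    counts = Counter(predictions)
    return min(counts, key=lambda emotion: (-counts[emotion], PRIORITY[emotion]))

def prioritize(messages):
    """Order voice messages by the urgency of their voted emotion label.

    Each message is a dict with a 'votes' list holding one predicted
    label per recognizer in the ensemble.
    """
    return sorted(messages, key=lambda m: PRIORITY[ensemble_vote(m["votes"])])

# Example: an angry/fearful caller is queued ahead of a happy one.
msgs = [
    {"id": 1, "votes": ["happiness", "happiness", "unemotional"]},
    {"id": 2, "votes": ["anger", "fear", "anger"]},
]
ordered = prioritize(msgs)  # message 2 comes first
```

In this sketch each recognizer in the ensemble contributes one label per message; a real system would weight votes by recognizer confidence, which the abstract leaves unspecified.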
© 2004 Springer-Verlag Berlin Heidelberg
Giripunje, S., Panat, A. (2004). Speech Recognition for Emotions with Neural Network: A Design Approach. In: Negoita, M.G., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based Intelligent Information and Engineering Systems. KES 2004. Lecture Notes in Computer Science, vol 3214. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30133-2_84
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23206-3
Online ISBN: 978-3-540-30133-2