Abstract
In this paper we investigate the role of user emotions in human-machine goal-oriented conversations. There has been growing interest in predicting emotions from acted and non-acted, spontaneous speech. Much of the research has gone into determining the correct labels and improving emotion prediction accuracy. Here we evaluate the value of the user's emotional state towards a computational model of emotion processing. We consider a binary representation of emotions (positive vs. negative) in the context of a goal-driven conversational system. For each human-machine interaction we acquire the temporal emotion sequence going from the initial to the final conversational state. These traces are used as features to characterize the user state dynamics. We ground the emotion traces by associating their patterns with dialog strategies and their effectiveness. To quantify the value of emotion indicators, we evaluate their predictions in terms of speech recognition and spoken language understanding errors as well as task success or failure. We report results on 11.5K dialog samples from the How May I Help You? corpus.
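To illustrate the trace representation described in the abstract, the following Python sketch turns a per-turn binary emotion sequence into a few summary features. The function name `trace_features`, the label encoding (1 = negative, 0 = positive), and the particular features are assumptions made for illustration; they are not taken from the paper.

```python
from typing import Dict, List


def trace_features(trace: List[int]) -> Dict[str, float]:
    """Summarize a binary emotion trace, one label per dialog turn.

    Hypothetical encoding: 1 = negative, 0 = positive.
    The features are illustrative assumptions, not the paper's feature set.
    """
    if not trace:
        raise ValueError("trace must contain at least one turn")
    if any(label not in (0, 1) for label in trace):
        raise ValueError("labels must be 0 or 1")

    # Longest run of consecutive negative turns.
    longest = current = 0
    for label in trace:
        current = current + 1 if label == 1 else 0
        longest = max(longest, current)

    # Number of changes between positive and negative across adjacent turns.
    switches = sum(1 for a, b in zip(trace, trace[1:]) if a != b)

    return {
        "num_turns": float(len(trace)),
        "negative_fraction": sum(trace) / len(trace),
        "longest_negative_run": float(longest),
        "switches": float(switches),
        "ends_negative": float(trace[-1]),
    }


# A five-turn dialog whose third and fourth turns are labeled negative.
example = trace_features([0, 0, 1, 1, 0])
```

Features of this kind could be passed to any classifier that predicts task success or failure; the abstract does not say which features or classifier the authors actually use.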
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Riccardi, G., Hakkani-Tür, D. (2005). Grounding Emotions in Human-Machine Conversational Systems. In: Maybury, M., Stock, O., Wahlster, W. (eds) Intelligent Technologies for Interactive Entertainment. INTETAIN 2005. Lecture Notes in Computer Science, vol 3814. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11590323_15
DOI: https://doi.org/10.1007/11590323_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30509-5
Online ISBN: 978-3-540-31651-0
eBook Packages: Computer Science, Computer Science (R0)