Abstract
In multiparty human–agent interaction, the agent should be able to respond appropriately to a user by determining whether an utterance is addressed to the agent or to another participant. This study proposes a model that predicts the addressee from acoustic information in speech and from head orientation as nonverbal information. First, we conducted a Wizard-of-Oz (WOZ) experiment to collect human–agent triadic conversations. We then analyzed whether acoustic features and head orientation correlated with addressee-hood. Based on this analysis, we propose an addressee prediction model that integrates acoustic features and head orientation using a support vector machine (SVM).
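The abstract describes an SVM that fuses acoustic and head-orientation features to decide whether an utterance is agent-directed. The following is a minimal sketch of that idea, not the authors' actual model: the feature set (mean F0, mean intensity, head yaw), the synthetic data, and the convention that yaw near zero means facing the agent are all illustrative assumptions.

```python
# Hedged sketch of an SVM addressee classifier combining acoustic features
# with head orientation. All features and data here are assumed for
# illustration; the paper's real corpus and feature set are not reproduced.
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic utterances: [mean_f0_hz, mean_intensity_db, head_yaw_deg].
# Assumed convention: head_yaw near 0 deg means the speaker faces the agent.
n = 200
to_agent = np.column_stack([
    rng.normal(180, 15, n),   # somewhat raised pitch toward the agent
    rng.normal(65, 3, n),     # louder, more articulated speech
    rng.normal(0, 8, n),      # facing the agent
])
to_human = np.column_stack([
    rng.normal(150, 15, n),
    rng.normal(58, 3, n),
    rng.normal(45, 8, n),     # turned toward the other participant
])
X = np.vstack([to_agent, to_human])
y = np.array([1] * n + [0] * n)  # 1 = addressed to the agent

# Standardize features (F0, dB, and degrees live on different scales),
# then fit an RBF-kernel SVM.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X, y)

# Classify a new utterance: raised pitch, higher intensity, facing the agent.
print(clf.predict([[185.0, 66.0, 2.0]])[0])
```

In a real system the acoustic features would come from a pitch/intensity extractor and the head yaw from a tracker; the SVM simply treats the fused vector as one multimodal observation.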
© 2011 Springer-Verlag Berlin Heidelberg
Cite this paper
Baba, N., Huang, H.-H., Nakano, Y.I. (2011). Identifying Utterances Addressed to an Agent in Multiparty Human–Agent Conversations. In: Vilhjálmsson, H.H., Kopp, S., Marsella, S., Thórisson, K.R. (eds) Intelligent Virtual Agents. IVA 2011. Lecture Notes in Computer Science, vol 6895. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23974-8_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23973-1
Online ISBN: 978-3-642-23974-8