Abstract
Many dialogue systems have been built over the years that address some subset of the many complex factors that shape the behavior of participants in a face-to-face conversation. The Ymir Turntaking Model (YTTM) is a broad computational model of conversational skills that has been in development for over a decade, continuously growing in the number of factors it addresses. In past work we have shown how it addresses realtime dialogue, communicative gesture, perception of turntaking signals (e.g. prosody, gaze, manual gesture), dialogue planning, learning of multimodal turn signals, and dynamic adaptation to human speaking style. The architectural principles of the YTTM prescribe a smaller architectural granularity than most other models, and they allow non-destructive additive expansion. In this paper we show how the YTTM accommodates multi-party dialogue. The extension has been implemented in a virtual environment; we present data for up to 12 simulated participants engaged in realtime cooperative dialogue. The system includes dynamically adjustable parameters for impatience, willingness to give turn, and eagerness to speak.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Thórisson, K.R., Gislason, O., Jonsdottir, G.R., Thorisson, H.T. (2010). A Multiparty Multimodal Architecture for Realtime Turntaking. In: Allbeck, J., Badler, N., Bickmore, T., Pelachaud, C., Safonova, A. (eds) Intelligent Virtual Agents. IVA 2010. Lecture Notes in Computer Science, vol 6356. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15892-6_37
DOI: https://doi.org/10.1007/978-3-642-15892-6_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15891-9
Online ISBN: 978-3-642-15892-6
eBook Packages: Computer Science (R0)