A Multiparty Multimodal Architecture for Realtime Turntaking

Thórisson, Kristinn R.; Gislason, Olafur; Jonsdottir, Gudny Ragna; Thorisson, Hrafn Th.

doi:10.1007/978-3-642-15892-6_37

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6356)

Included in the following conference series:

International Conference on Intelligent Virtual Agents

Abstract

Many dialogue systems have been built over the years that address some subset of the many complex factors that shape the behavior of participants in a face-to-face conversation. The Ymir Turntaking Model (YTTM) is a broad computational model of conversational skills that has been in development for over a decade, continuously growing in the number of factors it addresses. In past work we have shown how it addresses realtime dialogue, communicative gesture, perception of turntaking signals (e.g. prosody, gaze, manual gesture), dialogue planning, learning of multimodal turn signals, and dynamic adaptation to human speaking style. The architectural principles of the YTTM prescribe smaller architectural granularity than most other models, and its principles allow non-destructive additive expansion. In this paper we show how the YTTM accommodates multi-party dialogue. The extension has been implemented in a virtual environment; we present data for up to 12 simulated participants participating in realtime cooperative dialogue. The system includes dynamically adjustable parameters for impatience, willingness to give turn and eagerness to speak.

Author information

Authors and Affiliations

Center for Analysis & Design of Intelligent Agents, Reykjavik University, Iceland
Kristinn R. Thórisson, Olafur Gislason & Hrafn Th. Thorisson
Icelandic Institute for Intelligent Machines, Menntavegur 1, 101, Reykjavik, Iceland
Kristinn R. Thórisson & Gudny Ragna Jonsdottir

Editor information

Editors and Affiliations

Department of Computer Science, George Mason University, 22030, Fairfax, VA, USA
Jan Allbeck
University of Pennsylvania, 19104-6389, Philadelphia, PA, USA
Norman Badler
College of Computer and Information Science, Northeastern University, 02115, Boston, MA, USA
Timothy Bickmore
CNRS-LTCI, Institut Télécom - Télécom ParisTech, 75014, Paris, France
Catherine Pelachaud
Computer and Information Science, University of Pennsylvania, 19104-6389, Philadelphia, PA, USA
Alla Safonova

About this paper

Cite this paper

Thórisson, K.R., Gislason, O., Jonsdottir, G.R., Thorisson, H.T. (2010). A Multiparty Multimodal Architecture for Realtime Turntaking. In: Allbeck, J., Badler, N., Bickmore, T., Pelachaud, C., Safonova, A. (eds) Intelligent Virtual Agents. IVA 2010. Lecture Notes in Computer Science, vol 6356. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15892-6_37

DOI: https://doi.org/10.1007/978-3-642-15892-6_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15891-9
Online ISBN: 978-3-642-15892-6
eBook Packages: Computer Science, Computer Science (R0)
