A Multiparty Multimodal Architecture for Realtime Turntaking

  • Conference paper
Intelligent Virtual Agents (IVA 2010)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6356)

Abstract

Many dialogue systems have been built over the years that address some subset of the many complex factors that shape the behavior of participants in a face-to-face conversation. The Ymir Turntaking Model (YTTM) is a broad computational model of conversational skills that has been in development for over a decade, continuously growing in the number of factors it addresses. In past work we have shown how it addresses realtime dialogue, communicative gesture, perception of turntaking signals (e.g. prosody, gaze, manual gesture), dialogue planning, learning of multimodal turn signals, and dynamic adaptation to human speaking style. The YTTM prescribes finer architectural granularity than most other models, and its principles allow non-destructive additive expansion. In this paper we show how the YTTM accommodates multi-party dialogue. The extension has been implemented in a virtual environment; we present data for up to 12 simulated participants engaging in realtime cooperative dialogue. The system includes dynamically adjustable parameters for impatience, willingness to give turn, and eagerness to speak.
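
As a concrete illustration of the three dynamically adjustable parameters named above, the following minimal Python sketch (our illustration, not the authors' YTTM implementation; all class names, thresholds, and probabilistic decision rules are hypothetical assumptions) shows how impatience, willingness to give turn, and eagerness to speak could gate a single simulated participant's decisions to claim or yield the floor.

    import random
    from dataclasses import dataclass

    # Hypothetical parameter bundle; the names mirror the abstract, but the
    # numeric ranges and semantics are assumptions made for this sketch only.
    @dataclass
    class TurnParams:
        impatience: float           # 0..1, how quickly silence triggers a grab attempt
        willingness_to_give: float  # 0..1, how readily the current speaker yields
        eagerness_to_speak: float   # 0..1, baseline drive to claim the floor

    class Participant:
        def __init__(self, name: str, params: TurnParams):
            self.name = name
            self.params = params
            self.has_turn = False

        def wants_turn(self, silence_ms: float) -> bool:
            # Longer silences and higher impatience both raise the probability
            # of trying to claim the floor.
            pressure = self.params.eagerness_to_speak + \
                self.params.impatience * (silence_ms / 1000.0)
            return random.random() < min(pressure, 1.0)

        def yields_turn(self, others_want_turn: bool) -> bool:
            # The current speaker releases the floor with a probability set by
            # its willingness-to-give parameter, if someone else wants the turn.
            return others_want_turn and random.random() < self.params.willingness_to_give

    if __name__ == "__main__":
        a = Participant("A", TurnParams(impatience=0.8, willingness_to_give=0.3, eagerness_to_speak=0.5))
        b = Participant("B", TurnParams(impatience=0.2, willingness_to_give=0.9, eagerness_to_speak=0.4))
        a.has_turn = True
        if b.wants_turn(silence_ms=600) and a.yields_turn(others_want_turn=True):
            a.has_turn, b.has_turn = False, True
        print("Current speaker:", "A" if a.has_turn else "B")

In a multiparty setting such as the one reported in the paper, each simulated participant would plausibly carry its own parameter set, so that adjusting one agent's impatience or eagerness changes the emerging turn order without modifying the others.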

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Thórisson, K.R., Gislason, O., Jonsdottir, G.R., Thorisson, H.T. (2010). A Multiparty Multimodal Architecture for Realtime Turntaking. In: Allbeck, J., Badler, N., Bickmore, T., Pelachaud, C., Safonova, A. (eds) Intelligent Virtual Agents. IVA 2010. Lecture Notes in Computer Science (LNAI), vol 6356. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15892-6_37

  • DOI: https://doi.org/10.1007/978-3-642-15892-6_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15891-9

  • Online ISBN: 978-3-642-15892-6

  • eBook Packages: Computer Science (R0)
