Abstract
The paper describes aspects of situated interaction when a humanoid robot uses the WikiTalk system as a spoken-language dialogue interface. WikiTalk is a speech-based open-domain information access system that enables the user to move through Wikipedia from topic to topic and have chunks of interesting articles read aloud. The interactions with the robot are situated: they take place in a particular context and are driven by the user's interest and focus of attention. The interactions are also multimodal, as both user and robot extend their communicative repertoire with multimodal signals. The robot uses face tracking, nodding and gesturing to support interaction management and the presentation of new information to the partner, while the user speaks, moves, and can touch the robot to interrupt it.
Acknowledgments
We thank Adam Csapo, Emer Gilmartin, Jonathan Grizou, Frank Han, Raveesh Meena and Dimitra Anastasiou for their collaboration on the Nao WikiTalk implementation and the user evaluations at eNTERFACE 2012.
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this chapter
Laxström, N., Jokinen, K., Wilcock, G. (2016). Situated Interaction in a Multilingual Spoken Information Access Framework. In: Rudnicky, A., Raux, A., Lane, I., Misu, T. (eds) Situated Dialog in Speech-Based Human-Computer Interaction. Signals and Communication Technology. Springer, Cham. https://doi.org/10.1007/978-3-319-21834-2_15
DOI: https://doi.org/10.1007/978-3-319-21834-2_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21833-5
Online ISBN: 978-3-319-21834-2