Abstract
The paper describes aspects of situated interaction when a humanoid robot uses the WikiTalk system as a spoken-language dialogue interface. WikiTalk is a speech-based open-domain information access system that enables the user to move through Wikipedia from topic to topic and have chunks of interesting articles read aloud. The interactions with the robot are situated: they take place in a particular context and are driven by the user's interest and focus of attention. The interactions are also multimodal, as both user and robot extend their communicative repertoire with multimodal signals. The robot uses face tracking, nodding and gesturing to support interaction management and the presentation of new information to the partner, while the user speaks, moves, and can touch the robot to interrupt it.
Acknowledgments
We thank Adam Csapo, Emer Gilmartin, Jonathan Grizou, Frank Han, Raveesh Meena and Dimitra Anastasiou for their collaboration on the Nao WikiTalk implementation and the user evaluations at eNTERFACE 2012.
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this chapter
Laxström, N., Jokinen, K., Wilcock, G. (2016). Situated Interaction in a Multilingual Spoken Information Access Framework. In: Rudnicky, A., Raux, A., Lane, I., Misu, T. (eds) Situated Dialog in Speech-Based Human-Computer Interaction. Signals and Communication Technology. Springer, Cham. https://doi.org/10.1007/978-3-319-21834-2_15
DOI: https://doi.org/10.1007/978-3-319-21834-2_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21833-5
Online ISBN: 978-3-319-21834-2