Abstract
This chapter discusses approaches to increasing the naturalness and flexibility of human-robot interaction, with examples from the WikiTalk dialogue system. WikiTalk enables robots to talk fluently about thousands of topics using Wikipedia-based talking. However, three challenging areas need to be addressed to make the system more natural: speech interaction, face recognition, and interaction history. We address these challenges and describe more context-aware approaches that take the individual partner into account when generating responses. Finally, we discuss the need for a Wikipedia-based listening capability that would enable robots to follow the changing topics in human conversation and then join in, using Wikipedia-based talking to make new, topically relevant dialogue contributions.
Acknowledgements
The first author thanks Prof. Tatsuya Kawahara of Kyoto University for the opportunity to participate in the ERICA robot project. The second author acknowledges the support of the New Energy and Industrial Technology Development Organisation (NEDO) in Japan.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Wilcock, G., Jokinen, K. (2021). Towards Increasing Naturalness and Flexibility in Human-Robot Dialogue Systems. In: Marchi, E., Siniscalchi, S.M., Cumani, S., Salerno, V.M., Li, H. (eds) Increasing Naturalness and Flexibility in Spoken Dialogue Interaction. Lecture Notes in Electrical Engineering, vol 714. Springer, Singapore. https://doi.org/10.1007/978-981-15-9323-9_9
DOI: https://doi.org/10.1007/978-981-15-9323-9_9
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-9322-2
Online ISBN: 978-981-15-9323-9
eBook Packages: Engineering, Engineering (R0)