Towards Increasing Naturalness and Flexibility in Human-Robot Dialogue Systems

Chapter in: Increasing Naturalness and Flexibility in Spoken Dialogue Interaction

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 714)

Abstract

The chapter discusses approaches to increasing the naturalness and flexibility of human-robot interaction, with examples from the WikiTalk dialogue system. WikiTalk enables robots to talk fluently about thousands of topics using Wikipedia-based talking. However, three challenging areas need to be addressed to make the system more natural: speech interaction, face recognition, and interaction history. We address these challenges and describe more context-aware approaches that take the individual partner into account when generating responses. Finally, we discuss the need for a Wikipedia-based listening capability that would enable robots to follow the changing topics in human conversation. This would allow robots to join in the conversation, using Wikipedia-based talking to make new, topically relevant dialogue contributions.

Notes

  1. https://www.youtube.com/watch?v=NkMkImATfYQ

  2. https://github.com/bmilde/ambientsearch

  3. https://raw.githubusercontent.com/bmilde/ambientsearch/master/demo_video_august_2016.mp4

Acknowledgements

The first author thanks Prof. Tatsuya Kawahara of Kyoto University for the opportunity to participate in the ERICA robot project. The second author acknowledges the support of the New Energy and Industrial Technology Development Organisation (NEDO) in Japan.

Author information

Correspondence to Kristiina Jokinen.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Wilcock, G., Jokinen, K. (2021). Towards Increasing Naturalness and Flexibility in Human-Robot Dialogue Systems. In: Marchi, E., Siniscalchi, S.M., Cumani, S., Salerno, V.M., Li, H. (eds) Increasing Naturalness and Flexibility in Spoken Dialogue Interaction. Lecture Notes in Electrical Engineering, vol 714. Springer, Singapore. https://doi.org/10.1007/978-981-15-9323-9_9

  • DOI: https://doi.org/10.1007/978-981-15-9323-9_9

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-9322-2

  • Online ISBN: 978-981-15-9323-9

  • eBook Packages: Engineering, Engineering (R0)