Fine-Tuning a Pre-trained Transformer-Based Encoder-Decoder Model with User-Generated Question-Answer Pairs to Realize Character-Like Chatbots

  • Conference paper
Conversational AI for Natural Human-Centric Interaction

Abstract

Realizing character-like chatbots requires dialogue data for particular characters, but collecting such data is not easy. To solve this problem, we previously proposed a method called “role-play-based question answering,” in which many users play the role of a particular character to answer questions, yielding a large number of question-answer (QA) pairs associated with that character. In this study, we investigated how character-like dialogue can be realized by fine-tuning a pre-trained Transformer-based encoder-decoder model, which has proven effective in dialogue modelling, on the QA pairs collected via role-play-based question answering. The results of automatic and manual evaluations show that the fine-tuned model significantly outperforms a retrieval-based baseline and that, with 44K QA pairs, it achieves high naturalness and characterness scores.
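As a rough illustration of the data side of this setup, the sketch below turns role-play QA pairs into (source, target) examples of the kind an encoder-decoder model is usually fine-tuned on, with the question as input and the character's answer as output. This preview does not describe the paper's actual preprocessing, so the `QAPair` type, the function names, the deduplication step, and the 10% development split are all assumptions made for illustration only.

```python
import random
from typing import List, NamedTuple, Tuple


class QAPair(NamedTuple):
    """One user-generated question and the in-character answer (hypothetical type)."""
    question: str
    answer: str


def to_seq2seq_examples(pairs: List[QAPair]) -> List[Tuple[str, str]]:
    """Map QA pairs to (source, target) tuples, dropping empty and duplicate pairs."""
    seen = set()
    examples = []
    for pair in pairs:
        source, target = pair.question.strip(), pair.answer.strip()
        if not source or not target or (source, target) in seen:
            continue  # skip blank fields and exact repeats
        seen.add((source, target))
        examples.append((source, target))
    return examples


def split_examples(examples, dev_ratio=0.1, seed=0):
    """Shuffle deterministically and hold out a development set (assumed ratio)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_dev = int(len(shuffled) * dev_ratio)
    return shuffled[n_dev:], shuffled[:n_dev]


if __name__ == "__main__":
    raw = [
        QAPair("What is your hobby?", "Experiments, obviously."),
        QAPair("What is your hobby?", "Experiments, obviously."),  # duplicate
        QAPair("", "An answer without a question."),               # dropped
    ]
    train, dev = split_examples(to_seq2seq_examples(raw))
    print(len(train), len(dev))
```

The resulting pairs could then be passed to whichever sequence-to-sequence training loop is in use; the model, tokenizer, and optimizer settings are outside the scope of this sketch.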

Notes

  1. https://store.steampowered.com/app/412830/STEINSGATE/

  2. https://lucene.apache.org/

  3. https://github.com/yoheikikuta/bert-japanese

  4. https://telegram.org/

Author information

Corresponding author

Correspondence to Koh Mitsuda.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Mitsuda, K. et al. (2022). Fine-Tuning a Pre-trained Transformer-Based Encoder-Decoder Model with User-Generated Question-Answer Pairs to Realize Character-Like Chatbots. In: Stoyanchev, S., Ultes, S., Li, H. (eds) Conversational AI for Natural Human-Centric Interaction. Lecture Notes in Electrical Engineering, vol 943. Springer, Singapore. https://doi.org/10.1007/978-981-19-5538-9_20

  • DOI: https://doi.org/10.1007/978-981-19-5538-9_20

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-5537-2

  • Online ISBN: 978-981-19-5538-9

  • eBook Packages: Computer Science, Computer Science (R0)