Development and Research of Dialogue Agents with Long-Term Memory and Web Search

  • Conference paper
Speech and Computer (SPECOM 2023)

Abstract

This paper describes the training of a neural network that draws additional knowledge from the Internet and from long-term memory in order to improve the quality of generated dialogue responses. Thanks to their large size and high-quality training data, modern language models can generate meaningful text, including dialogue with other speakers. However, their knowledge is frozen at the time their training data was collected: without retraining, such models cannot acquire new, relevant knowledge. We propose one possible solution to this problem, in which the model uses knowledge retrieved from the Internet and from long-term memory to generate dialogue responses. With these methods, we improved the BLEU-1 metric by 43% and the BLEU-2 metric by 45% on the Toloka Persona Chat Rus dataset.

Notes

  1. https://toloka.ai/datasets

  2. https://huggingface.co/Helsinki-NLP/opus-mt-tc-big-en-zle

  3. https://huggingface.co/sberbank-ai/ruBert-base

  4. https://huggingface.co/sberbank-ai/ruT5-base

  5. https://huggingface.co/sberbank-ai/ruT5-large

  6. https://huggingface.co/sberbank-ai/rugpt3large_based_on_gpt2

Acknowledgment

The research was financially supported by the Russian Science Foundation (project 22-11-00128).

Author information

Corresponding author

Correspondence to Olesia Makhnytkina.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Apanasovich, K., Makhnytkina, O., Matveev, Y. (2023). Development and Research of Dialogue Agents with Long-Term Memory and Web Search. In: Karpov, A., Samudravijaya, K., Deepak, K.T., Hegde, R.M., Agrawal, S.S., Prasanna, S.R.M. (eds) Speech and Computer. SPECOM 2023. Lecture Notes in Computer Science, vol 14338. Springer, Cham. https://doi.org/10.1007/978-3-031-48309-7_32

  • DOI: https://doi.org/10.1007/978-3-031-48309-7_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-48308-0

  • Online ISBN: 978-3-031-48309-7

  • eBook Packages: Computer Science, Computer Science (R0)
