Abstract
This paper describes the training process of a neural network that can extract additional knowledge from the Internet and from long-term memory in order to improve the quality of generated dialogue responses. Modern language models, owing to their large size and high-quality training data, can generate meaningful text, including dialogue with other speakers. However, their knowledge is frozen in time by the data they were trained on: without retraining, such models cannot acquire new relevant knowledge. We propose one possible solution to this problem, in which the neural network model uses knowledge retrieved from the Internet and from long-term memory to generate dialogue responses. Using these methods, we improved the BLEU-1 metric by 43% and the BLEU-2 metric by 45% on the Toloka Persona Chat Rus dataset.
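For readers unfamiliar with the reported metrics, the following is a minimal sketch of how sentence-level BLEU-1 and BLEU-2 are commonly computed: clipped n-gram precision, combined by a geometric mean and multiplied by a brevity penalty. The abstract does not say which BLEU implementation, tokenization, or smoothing was used, so the function names (`ngrams`, `bleu`), the whitespace tokenization, the single reference, and the absence of smoothing are all assumptions made for illustration only.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n):
    """Sentence-level BLEU with uniform weights over orders 1..max_n.

    Illustrative assumptions: one reference per candidate, no smoothing.
    """
    if not candidate:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        total = sum(cand_counts.values())
        # Clipping: a candidate n-gram is credited at most as many times
        # as it occurs in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # without smoothing, a zero precision zeroes the score
        log_precision_sum += math.log(matched / total)
    # Brevity penalty: penalize candidates that are not longer than the reference.
    c_len, r_len = len(candidate), len(reference)
    bp = 1.0 if c_len > r_len else math.exp(1 - r_len / c_len)
    return bp * math.exp(log_precision_sum / max_n)


# Hypothetical example sentences, split on whitespace.
reference = "i like to walk my dog in the park".split()
candidate = "i like to walk in the park".split()
bleu_1 = bleu(candidate, reference, max_n=1)
bleu_2 = bleu(candidate, reference, max_n=2)
```

In this sketch BLEU-1 uses unigram precision only, while BLEU-2 takes the geometric mean of unigram and bigram precision, so BLEU-2 is never higher than BLEU-1 for the same sentence pair. Corpus-level scores, as typically reported in papers, aggregate the n-gram counts over all sentence pairs before taking the ratio rather than averaging per-sentence scores.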
Acknowledgment
The research was financially supported by the Russian Science Foundation (project 22-11-00128).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Apanasovich, K., Makhnytkina, O., Matveev, Y. (2023). Development and Research of Dialogue Agents with Long-Term Memory and Web Search. In: Karpov, A., Samudravijaya, K., Deepak, K.T., Hegde, R.M., Agrawal, S.S., Prasanna, S.R.M. (eds) Speech and Computer. SPECOM 2023. Lecture Notes in Computer Science, vol 14338. Springer, Cham. https://doi.org/10.1007/978-3-031-48309-7_32
DOI: https://doi.org/10.1007/978-3-031-48309-7_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48308-0
Online ISBN: 978-3-031-48309-7
eBook Packages: Computer Science (R0)