Abstract
Transformer-based open-domain dialog models have become increasingly popular in recent years. These models typically represent context as a concatenation of the dialog history. However, there is no established criterion for how many utterances should be kept in the context. We study how the choice of context length affects the model. We experiment on three questions, from coarse to fine: (i) Does a longer context help model training? (ii) Is it necessary to change the training context length when dealing with dialogs of different context lengths? (iii) Do different dialog samples have the same preference for context length? Our experimental results show that context length, an often overlooked setting, deserves attention when implementing Transformer-based dialog models. Code is available at https://github.com/PKUAI-LINGroup/context-study.
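To make the setting concrete, the following minimal Python sketch illustrates the context-truncation choice the paper studies: keeping only the most recent n utterances of the dialog history and concatenating them into a single model input. This is our illustration, not the authors' released code; the <eos> separator and the function name build_context are assumptions.

    from typing import List

    # Hypothetical utterance separator; actual models often use a special
    # end-of-utterance token such as the tokenizer's EOS token.
    SEP = " <eos> "

    def build_context(history: List[str], max_utterances: int) -> str:
        """Concatenate the last `max_utterances` turns into one context string."""
        truncated = history[-max_utterances:] if max_utterances > 0 else []
        return SEP.join(truncated)

    if __name__ == "__main__":
        history = [
            "Hi, how are you?",
            "I'm good, just got back from a hike.",
            "Nice! Where did you go?",
            "A trail near the lake, it was beautiful.",
        ]
        # Varying the context length changes what the model conditions on:
        for n in (1, 2, 4):
            print(f"n={n}: {build_context(history, n)!r}")

The paper's experiments amount to varying this truncation length (at training and at inference time) and measuring its effect on model quality.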
Acknowledgements
This work was supported by the NSFC under grants 62086009 and 61732001.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Shen, X., Lin, Z. (2024). An Empirical Study on Context Length for Open-Domain Dialog Generation. In: Liu, F., Sadanandan, A.A., Pham, D.N., Mursanto, P., Lukose, D. (eds.) PRICAI 2023: Trends in Artificial Intelligence. PRICAI 2023. Lecture Notes in Computer Science, vol. 14326. Springer, Singapore. https://doi.org/10.1007/978-981-99-7022-3_29
Print ISBN: 978-981-99-7021-6
Online ISBN: 978-981-99-7022-3