
An Empirical Study on Context Length for Open-Domain Dialog Generation

  • Conference paper
PRICAI 2023: Trends in Artificial Intelligence (PRICAI 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14326)


Abstract

Transformer-based open-domain dialog models have become increasingly popular in recent years. These models typically represent the context as a concatenation of the dialog history. However, there is no established criterion for how many utterances of the history should be kept in the context. We investigate how the choice of context length affects the model. We experiment on three questions, from coarse to fine: (i) Does a longer context help model training? (ii) Is it necessary to change the training context length when dealing with dialogs of different context lengths? (iii) Do different dialog samples have the same preference for context length? Our experimental results show that context length, an often overlooked setting, deserves attention when implementing Transformer-based dialog models. Code is available at https://github.com/PKUAI-LINGroup/context-study.
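
To make the notion of context length concrete: the model input is built by keeping only the most recent k utterances of the dialog history and flattening them into a single sequence. The Python sketch below illustrates this truncate-and-concatenate step under stated assumptions; the separator token "<|sep|>" and the function name build_model_input are illustrative choices, not the paper's actual preprocessing.

    # Minimal sketch (assumptions noted above, not the paper's exact preprocessing):
    # keep only the last `context_length` utterances of the dialog history and join
    # them, together with the target response, into one flat sequence.

    from typing import List


    def build_model_input(history: List[str], response: str, context_length: int) -> str:
        """Truncate the history to its last `context_length` utterances and
        concatenate them with the target response using a separator token."""
        kept = history[-context_length:] if context_length > 0 else []
        return " <|sep|> ".join(kept + [response])


    if __name__ == "__main__":
        history = [
            "Hi, how was your weekend?",
            "Pretty good, I went hiking.",
            "Nice! Where did you go?",
            "A trail near the lake, the weather was perfect.",
        ]
        # Varying context_length changes how much history the model conditions on.
        for k in (1, 2, 4):
            print(f"k={k}: {build_model_input(history, 'Sounds lovely.', k)}")

Varying the truncation parameter in such a setup is the kind of context-length choice the study examines.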


Notes

  1. https://chat.openai.com/

  2. https://github.com/huggingface/transformers


Acknowledgements

This work was supported by the NSFC under grant numbers 62086009/61732001.

Author information

Correspondence to Xinyi Shen or Zuoquan Lin.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Shen, X., Lin, Z. (2024). An Empirical Study on Context Length for Open-Domain Dialog Generation. In: Liu, F., Sadanandan, A.A., Pham, D.N., Mursanto, P., Lukose, D. (eds) PRICAI 2023: Trends in Artificial Intelligence. PRICAI 2023. Lecture Notes in Computer Science, vol 14326. Springer, Singapore. https://doi.org/10.1007/978-981-99-7022-3_29

  • DOI: https://doi.org/10.1007/978-981-99-7022-3_29

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-7021-6

  • Online ISBN: 978-981-99-7022-3

  • eBook Packages: Computer Science, Computer Science (R0)
