
Interpreting open-domain dialogue generation by disentangling latent feature representations

  • Original Article
  • Published in: Neural Computing and Applications

Abstract

End-to-end deep learning based open-domain dialogue systems remain black-box models, and such data-driven models readily generate irrelevant content. Specifically, latent variables are highly entangled with different semantics in the latent space because no prior knowledge guides the training. To address this problem, this paper proposes to equip the generative model with prior knowledge through a cognitive approach based on feature disentanglement. In particular, the model is trained jointly on guided-category knowledge and open-domain dialogue data, injecting the prior knowledge into the latent space and enabling the model to disentangle the latent variables. In addition, this paper proposes a new metric for open-domain dialogues that objectively evaluates the interpretability of the latent space distribution. Finally, we validate the model on different datasets and experimentally demonstrate that it generates higher-quality and more interpretable dialogues than other models.
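To make the idea in the abstract concrete, below is a minimal PyTorch sketch of a CVAE-style dialogue model whose latent code is additionally supervised by a category label, so that guided-category knowledge pushes known semantics into the latent space. This is an illustration under stated assumptions, not the authors' implementation: the module names, dimensions, single-layer GRU encoder/decoder, and unweighted three-term loss are all choices made here for brevity.

```python
# Hypothetical sketch: a conditional VAE for dialogue whose latent code z is
# also trained to predict a guided category, encouraging z to disentangle
# along category semantics. Not the paper's actual model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CategoryGuidedCVAE(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256,
                 z_dim=64, n_categories=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        # Posterior parameters for q(z | dialogue context).
        self.to_mu = nn.Linear(hid_dim, z_dim)
        self.to_logvar = nn.Linear(hid_dim, z_dim)
        # Auxiliary head: predicts the guided category from z, injecting
        # category knowledge into the latent space.
        self.category_head = nn.Linear(z_dim, n_categories)
        self.decoder = nn.GRU(emb_dim + z_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, src, tgt, category):
        # Encode the dialogue context into a single hidden state.
        _, h = self.encoder(self.embed(src))       # h: (1, B, hid_dim)
        h = h.squeeze(0)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        # Reparameterization trick.
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        # Decode with z concatenated to every input step (teacher forcing).
        z_steps = z.unsqueeze(1).expand(-1, tgt.size(1), -1)
        dec_out, _ = self.decoder(
            torch.cat([self.embed(tgt), z_steps], dim=-1))
        logits = self.out(dec_out)
        # Three-part loss: reconstruction + KL + category guidance.
        rec = F.cross_entropy(
            logits[:, :-1].reshape(-1, logits.size(-1)),
            tgt[:, 1:].reshape(-1))
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        guide = F.cross_entropy(self.category_head(z), category)
        return rec + kl + guide
```

In a real implementation the three loss terms would be weighted, and the KL term is typically annealed early in training to avoid posterior collapse; the paper's actual architecture, loss, and interpretability metric are described in the full article.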



Data Availability Statement

The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.


Acknowledgements

This work was partly supported by the National Key R&D Program of China (2021YFF0704100), the National Natural Science Foundation of China (62136002, 61936001 and 61876027), the Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN202100627 and KJQN202100629), and the National Natural Science Foundation of Chongqing (cstc2019jcyj-cxttX0002 and cstc2022ycjh-bgzxm0004).

Author information

Corresponding author

Correspondence to Hong Yu.

Ethics declarations

Conflict of interest

The authors declare that they have no commercial or associative interest that represents a conflict of interest in connection with the submitted work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Wang, Y., Liao, J., Yu, H. et al. Interpreting open-domain dialogue generation by disentangling latent feature representations. Neural Comput & Applic 35, 20855–20867 (2023). https://doi.org/10.1007/s00521-023-08815-3
