DOI:10.1145/3446132.3446168
research-article

Leveraging Different Context for Response Generation through Topic-guided Multi-head Attention

Published: 09 March 2021

Abstract

Multi-turn dialogue systems play an important role in intelligent interaction. Response generation, a subtask of multi-turn conversation systems, is particularly challenging because it aims to produce responses that are both diverse and contextually relevant. Most existing methods model the sequential connection between sentences with a hierarchical framework and an attention mechanism, but they do not reflect overall semantic information such as topics. As a result, they do not fully capture the dialogue history. In this paper, we propose a context-augmented model, named TGMA-RG, which leverages the conversational context to promote the interactivity and persistence of multi-turn dialogues through a topic-guided multi-head attention mechanism. Specifically, we extract topics from the conversational context and design a hierarchical encoder-decoder model with a multi-head attention mechanism. Within this model, we use topic vectors as the queries of the attention mechanism to obtain the weights between each utterance and each topic. Experimental results on two publicly available datasets show that TGMA-RG outperforms the baselines in terms of BLEU-1, BLEU-2, Distinct-1, Distinct-2 and perplexity (PPL).
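The idea of using topic vectors as attention queries can be illustrated with a minimal sketch. It is not the TGMA-RG implementation: the function names (`topic_guided_attention`, `split_heads`), the toy vectors, and the omission of learned projection matrices are all assumptions made for illustration. It only shows how scaled dot-product multi-head attention yields one weight per (topic, utterance) pair when topics are the queries and utterance encodings are the keys and values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def split_heads(vec, num_heads):
    """Split one vector into num_heads equal contiguous slices."""
    d = len(vec) // num_heads
    return [vec[h * d:(h + 1) * d] for h in range(num_heads)]


def topic_guided_attention(topics, utterances, num_heads):
    """Topic vectors are the queries; utterance vectors are keys and values.

    Returns (weights, contexts) where
      weights[h][t][u] is the attention of topic t on utterance u in head h,
      contexts[t] concatenates, over heads, the weighted sum of utterances.
    Assumption: no learned projections, heads are plain slices of the vectors.
    """
    dim = len(topics[0])
    assert dim % num_heads == 0, "dimension must be divisible by num_heads"
    d_head = dim // num_heads
    q = [split_heads(t, num_heads) for t in topics]
    k = [split_heads(u, num_heads) for u in utterances]

    weights = []
    contexts = [[] for _ in topics]
    for h in range(num_heads):
        head_weights = []
        for t in range(len(topics)):
            # Scaled dot product between topic t and every utterance.
            scores = [
                sum(a * b for a, b in zip(q[t][h], k[u][h])) / math.sqrt(d_head)
                for u in range(len(utterances))
            ]
            w = softmax(scores)
            head_weights.append(w)
            # Weighted sum of utterance slices (values equal keys here).
            ctx = [
                sum(w[u] * k[u][h][i] for u in range(len(utterances)))
                for i in range(d_head)
            ]
            contexts[t].extend(ctx)
        weights.append(head_weights)
    return weights, contexts


# Toy example: 2 hypothetical topic vectors, 3 hypothetical utterance encodings.
topics = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
utterances = [[2.0, 0.0, 0.0, 2.0], [0.0, 2.0, 2.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
weights, contexts = topic_guided_attention(topics, utterances, num_heads=2)
```

In this toy setup each row `weights[h][t]` is a probability distribution over the three utterances, so a decoder could use `contexts[t]` as a topic-specific summary of the dialogue history. A trained model would normally apply learned query, key and value projections before splitting into heads.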

Cited By

  • (2025) Multi-turn Natural Language Understanding. Natural Language Understanding in Conversational AI with Deep Learning, 87-110. DOI:10.1007/978-3-031-74364-1_4. Online publication date: 12-Jan-2025
  • (2023) Pre-Trained Generative Architectures for Question-Asking Chatbots on Technical Text: A Case Study. 2023 3rd Asian Conference on Innovation in Technology (ASIANCON), 1-6. DOI:10.1109/ASIANCON58793.2023.10270559. Online publication date: 25-Aug-2023

Published In

ACAI '20: Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence
December 2020
576 pages
ISBN:9781450388115
DOI:10.1145/3446132
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ACAI 2020

Acceptance Rates

Overall Acceptance Rate 173 of 395 submissions, 44%
