Abstract
Selecting appropriate information from the dialogue history and the document is a prerequisite for generating a high-quality response in the document-grounded dialogue generation task. Most existing works treat the dialogue history as a flat sequence when interacting with the document. In fact, the dialogue history has an internal hierarchical structure, which can constrain both the selection of history information and its interaction with document information. This paper therefore proposes a model that exploits the hierarchical structure of the dialogue history for key information selection. The main idea is to locate important information in the history and the document by merging word-level and utterance-level attention over the history, and then to generate a better response. Experimental results on two public datasets show that our method significantly outperforms the baseline models.
Notes
We use NLG evaluation toolkit [18] from https://github.com/Maluuba/nlg-eval.
We use the code published at https://github.com/facebookresearch/ParlAI/blob/master/parlai/core/metrics.py to calculate uni-gram F1.
References
Chen K, Zhang Z, Long J, Zhang H (2016) Turning from TF-IDF to TF-IGM for term weighting in text classification. Expert Syst Appl 66:245–260. https://doi.org/10.1016/j.eswa.2016.09.009
Denkowski MJ, Lavie A (2011) Meteor 1.3: Automatic metric for reliable optimization and evaluation of machine translation systems. In: Callison-Burch C, Koehn P, Monz C, Zaidan O (eds) Proceedings of the sixth workshop on statistical machine translation, WMT@EMNLP 2011. https://aclanthology.org/W11-2107/. Association for Computational Linguistics, Edinburgh, pp 85–91
Dinan E, Roller S, Shuster K, Fan A, Auli M, Weston J (2019) Wizard of wikipedia: Knowledge-powered conversational agents. In: 7th International conference on learning representations, ICLR 2019. https://openreview.net/forum?id=r1l73iRqKm. OpenReview.net, New Orleans
Dong L, Yang N, Wang W, Wei F, Liu X, Wang Y, Gao J, Zhou M, Hon H (2019) Unified language model pre-training for natural language understanding and generation. In: Wallach HM, Larochelle H, Beygelzimer A, D’Alché-Buc F, Fox EB, Garnett R (eds) Advances in neural information processing systems 32: Annual conference on neural information processing systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada, pp 13042–13054. https://proceedings.neurips.cc/paper/2019/hash/c20bb2d9a50d5ac1f713f8b34d9aac5a-Abstract.html
Feng S, Wan H, Gunasekara RC, Patel SS, Joshi S, Lastras LA (2020) Doc2dial: a goal-oriented document-grounded dialogue dataset. In: Webber B, Cohn T, He Y, Liu Y (eds) Proceedings of the 2020 conference on empirical methods in natural language processing, EMNLP 2020, Online, November 16-20, 2020. Association for Computational Linguistics, pp 8118–8128, https://doi.org/10.18653/v1/2020.emnlp-main.652
Gu J, Ling Z, Liu Q, Chen Z, Zhu X (2020) Filtering before iteratively referring for knowledge-grounded response selection in retrieval-based chatbots. In: Cohn T, He Y, Liu Y (eds) Findings of the association for computational linguistics: EMNLP 2020, Online Event, 16-20 November 2020, Findings of ACL, vol. EMNLP 2020. https://doi.org/10.18653/v1/2020.findings-emnlp.127. Association for Computational Linguistics, pp 1412–1422
Kim B, Ahn J, Kim G (2020) Sequential latent knowledge selection for knowledge-grounded dialogue. In: 8th International conference on learning representations, ICLR 2020. https://openreview.net/forum?id=Hke0K1HKwr. OpenReview.net, Addis Ababa
Lewis M, Liu Y, Goyal N, Ghazvininejad M, Mohamed A, Levy O, Stoyanov V, Zettlemoyer L (2020) BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In: Jurafsky D, Chai J, Schluter N, Tetreault JR (eds) Proceedings of the 58th annual meeting of the association for computational linguistics, ACL 2020, Online, July 5-10, 2020. https://doi.org/10.18653/v1/2020.acl-main.703. Association for Computational Linguistics, pp 7871–7880
Li K, Bai Z, Wang X, Yuan C (2019) A document driven dialogue generation model. In: Sun M, Huang X, Ji H, Liu Z, Liu Y (eds) Chinese computational linguistics - 18th China National Conference, CCL 2019, Kunming, China, October 18-20, 2019, Proceedings, Lecture Notes in Computer Science, vol 11856. Springer, pp 508–520. https://doi.org/10.1007/978-3-030-32381-3_41
Li L, Xu C, Wu W, Zhao Y, Zhao X, Tao C (2020) Zero-resource knowledge-grounded dialogue generation. In: Larochelle H, Ranzato M, Hadsell R, Balcan M, Lin H (eds) Advances in neural information processing systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual. https://proceedings.neurips.cc/paper/2020/hash/609c5e5089a9aa967232aba2a4d03114-Abstract.html
Li Z, Niu C, Meng F, Feng Y, Li Q, Zhou J (2019) Incremental transformer with deliberation decoder for document grounded conversations. In: Korhonen A, Traum DR, Màrquez L (eds) Proceedings of the 57th conference of the association for computational linguistics, ACL 2019, Florence, Italy, July 28- August 2, 2019, Volume 1: Long Papers. Association for Computational Linguistics, pp 12–21, https://doi.org/10.18653/v1/p19-1002
Lin CY (2004) Rouge: a package for automatic evaluation of summaries. In: Text summarization branches out, pp 74–81
Moghe N, Arora S, Banerjee S, Khapra MM (2018) Towards exploiting background knowledge for building conversation systems. In: Riloff E, Chiang D, Hockenmaier J, Tsujii J (eds) Proceedings of the 2018 conference on empirical methods in natural language processing, Brussels, Belgium, October 31 - November 4, 2018. Association for Computational Linguistics, pp 2322–2332, https://doi.org/10.18653/v1/d18-1255
Papineni K, Roukos S, Ward T, Zhu W (2002) Bleu: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the association for computational linguistics. https://doi.org/10.3115/1073083.1073135. https://aclanthology.org/P02-1040/. ACL, Philadelphia, pp 311–318
Prabhumoye S, Hashimoto K, Zhou Y, Black AW, Salakhutdinov R (2021) Focused attention improves document-grounded generation. In: Toutanova K, Rumshisky A, Zettlemoyer L, Hakkani-Tür D, Beltagy I, Bethard S, Cotterell R, Chakraborty T, Zhou Y (eds) Proceedings of the 2021 conference of the north american chapter of the association for computational linguistics: Human Language Technologies, NAACL-HLT 2021, Online, June 6-11, 2021. Association for Computational Linguistics, pp 4274–4287, https://doi.org/10.18653/v1/2021.naacl-main.338
Ren P, Chen Z, Monz C, Ma J, de Rijke M (2020) Thinking globally, acting locally: Distantly supervised global-to-local knowledge selection for background based conversation. In: The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, February 7-12, 2020. https://ojs.aaai.org/index.php/AAAI/article/view/6395. AAAI Press, pp 8697–8704
Serban IV, Sordoni A, Bengio Y, Courville AC, Pineau J (2016) Building end-to-end dialogue systems using generative hierarchical neural network models. In: Schuurmans D, Wellman MP (eds) Proceedings of the Thirtieth AAAI conference on artificial intelligence, February 12-17, 2016, Phoenix, Arizona, USA. http://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/view/11957. AAAI Press, pp 3776–3784
Sharma S, Asri LE, Schulz H, Zumer J (2017) Relevance of unsupervised metrics in task-oriented dialogue for evaluating natural language generation. arXiv:1706.09799
Shen L, Zhan H, Shen X, Feng Y (2021) Learning to select context in a hierarchical and global perspective for open-domain dialogue generation. In: IEEE International conference on acoustics, speech and signal processing, ICASSP 2021, Toronto, ON, Canada, June 6-11, 2021. IEEE, pp 7438–7442, https://doi.org/10.1109/ICASSP39728.2021.9414730
Wang T, Guo J, Wu Z, Xu T (2021) IFTA: Iterative filtering by using TF-AICL algorithm for Chinese encyclopedia knowledge refinement. Appl Intell 51(8):6265–6293. https://doi.org/10.1007/s10489-021-02220-w
Wang T, Li J, Guo J (2021) A scalable parallel Chinese online encyclopedia knowledge denoising method based on entry tags and spark cluster. Appl Intell 51(10):7573–7599. https://doi.org/10.1007/s10489-021-02295-5
Wu Z, Galley M, Brockett C, Zhang Y, Gao X, Quirk C, Koncel-Kedziorski R, Gao J, Hajishirzi H, Ostendorf M, Dolan B (2021) A controllable model of grounded response generation. In: Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021, Thirty-Third Conference on Innovative Applications of Artificial Intelligence, IAAI 2021, The Eleventh Symposium on Educational Advances in Artificial Intelligence, EAAI 2021, Virtual Event, February 2-9, 2021. https://ojs.aaai.org/index.php/AAAI/article/view/17658. AAAI Press, pp 14085–14093
Xing C, Wu Y, Wu W, Huang Y, Zhou M (2018) Hierarchical recurrent attention network for response generation. In: McIlraith SA, Weinberger KQ (eds) Proceedings of the Thirty-Second AAAI conference on artificial intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI symposium on educational advances in artificial intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2-7, 2018. https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16510. AAAI Press, pp 5610–5617
Zhang H, Lan Y, Pang L, Guo J, Cheng X (2019) Recosa: Detecting the relevant contexts with self-attention for multi-turn dialogue generation. In: Korhonen A, Traum DR, Màrquez L (eds) Proceedings of the 57th conference of the association for computational linguistics, ACL 2019, Florence, Italy, July 28- August 2, 2019, Volume 1: Long Papers. Association for Computational Linguistics, pp 3721–3730, https://doi.org/10.18653/v1/p19-1362
Zhao X, Wu W, Tao C, Xu C, Zhao D, Yan R (2020) Low-resource knowledge-grounded dialogue generation. In: 8th International conference on learning representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. OpenReview.net. https://openreview.net/forum?id=rJeIcTNtvS
Zheng W, Zhou K (2019) Enhancing conversational dialogue models with grounded knowledge. In: Zhu W, Tao D, Cheng X, Cui P, Rundensteiner EA, Carmel D, He Q, Yu JX (eds) Proceedings of the 28th ACM International Conference on Information and Knowledge Management, CIKM 2019, Beijing, China, November 3-7, 2019. ACM, pp 709–718, https://doi.org/10.1145/3357384.3357889
Zhou K, Prabhumoye S, Black AW (2018) A dataset for document grounded conversations. In: Riloff E, Chiang D, Hockenmaier J, Tsujii J (eds) Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31 - November 4, 2018. Association for Computational Linguistics, pp 708–713, https://doi.org/10.18653/v1/d18-1076
Acknowledgements
We thank the anonymous reviewers for their insightful comments. The research is supported in part by the National Natural Science Foundation of China (Grant No. NSFC62076032) and the BUPT Excellent Ph.D. Students Foundation (Grant No. CX2019002).
Appendix A
1.1 A.1: Real data examples for ‘Attention update’ in HDIS
Because the detailed equations of HDIS are complex, we give corresponding real-data examples for HDIS in Fig. 8.
In the hierarchical history-based document information selection (HDIS) module, there are three variables: the word-level history representation Hw, the utterance-level history representation Hu, and the word-level document representation Dw.
For ease of understanding, we assume here that the document has 5 words, the history has two utterances of 3 words each, and each word representation has dimension 2; we assign relatively simple values to each word representation. Note that red values are related to the first utterance in the history, and green values are related to the second utterance.
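The "attention update" on these toy shapes can be sketched with NumPy. This is an illustrative sketch, not the paper's exact equations: the dot-product scoring, the repetition of utterance weights over their words, and the renormalization step are all our assumptions about how word-level and utterance-level attention might be merged.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Toy shapes from the example: 5 document words, 2 utterances x 3 words, dim 2.
rng = np.random.default_rng(0)
Dw = rng.normal(size=(5, 2))   # word-level document representation
Hw = rng.normal(size=(6, 2))   # word-level history (2 utterances * 3 words)
Hu = rng.normal(size=(2, 2))   # utterance-level history representation

# Word-level attention of each document word over all history words.
a_word = softmax(Dw @ Hw.T, axis=-1)             # (5, 6)
# Utterance-level attention of each document word over the two utterances.
a_utt = softmax(Dw @ Hu.T, axis=-1)              # (5, 2)

# Merge: rescale each word weight by its utterance's weight
# (each utterance weight is repeated over its 3 words), then renormalize.
a_merged = a_word * np.repeat(a_utt, 3, axis=1)  # (5, 6)
a_merged = a_merged / a_merged.sum(axis=-1, keepdims=True)

# Document words then read from the history with the merged weights.
context = a_merged @ Hw                          # (5, 2)
print(context.shape)  # (5, 2)
```

Under this sketch, a history word is emphasized only when both its own score and its utterance's score are high, which is the constraint the hierarchical structure is meant to provide.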
1.2 A.2: Specific values of the key parameters
Figure 7 gives the specific values of the key parameters of our algorithm with which the best results are achieved.
1.3 A.3: Real data examples for static and dynamic ways to obtain utterance-level history
The real-data examples for the static and dynamic ways of obtaining the utterance-level history in HHIS, mentioned in section 'Experiments - Implementation Details', are shown in Fig. 9.
In the hierarchical history information selection (HHIS) module, there are two variables: the word-level history representation Hw and the decoder state of the previous time step, s_{t-1}.
For ease of understanding, we assume here that the history has two utterances of 3 words each and that each word representation has dimension 2; we assign relatively simple values to each word representation. Note that red values are related to the first utterance in the history, and green values are related to the second utterance.
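The contrast between the static and dynamic ways of obtaining the utterance-level history can be sketched as follows. The pooling choices are assumptions for illustration: mean-pooling stands in for any fixed per-utterance summary, and dot-product attention with s_{t-1} stands in for any decoding-state-conditioned summary; the paper's actual operators may differ.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(1)
Hw = rng.normal(size=(2, 3, 2))  # 2 utterances, 3 words each, dim 2
s_prev = rng.normal(size=(2,))   # previous-step decoder state s_{t-1}

# Static: pool each utterance's word vectors once, independent of decoding.
Hu_static = Hw.mean(axis=1)                       # (2, 2), fixed for all steps

# Dynamic: re-pool at every decoding step, attending with s_{t-1}.
scores = Hw @ s_prev                              # (2, 3) word scores per utterance
alpha = softmax(scores, axis=-1)                  # per-utterance word weights
Hu_dynamic = (alpha[..., None] * Hw).sum(axis=1)  # (2, 2), changes as s_{t-1} does
```

The static summary is computed once, while the dynamic one is recomputed at each decoding step, letting the utterance-level representation track what the decoder currently needs.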
Cite this article
Wang, M., Tian, S., Bai, Z. et al. Hierarchical history based information selection for document grounded dialogue generation. Appl Intell 53, 17139–17153 (2023). https://doi.org/10.1007/s10489-022-04373-8