Abstract
In conversational systems, external knowledge can be used to generate more diverse responses that contain factual content. Leveraging knowledge in a conversation system is important but challenging. First, the system needs to find the appropriate knowledge. Second, the knowledge needs to be encoded effectively and turned into fluent utterances. In this paper, we propose a knowledge-driven conversation system to address these challenges. The system consists of three modules: a topic predictor, a knowledge selector, and a dialogue generator. The topic predictor combines a non-deep-learning (coarse-grained) method with a deep-learning (fine-grained) method to form a coarse recall and fine ranking process that predicts conversation topics. The knowledge selector finds the most appropriate knowledge based on the filtered topics. We propose the Bert2Transformer model as our dialogue generator, which generates rich and fluent utterances based on the context and the relevant knowledge. On the public corpus KdConv, our system outperforms a strong baseline and achieves state-of-the-art results. In the ablation study, we analyze the effectiveness of the proposed components in detail and investigate the factors that may affect the performance of knowledge-driven conversation generation. Experimental results show that the proposed system achieves a significant improvement over traditional baseline methods. With and without appropriate knowledge, the average BLEU scores of our system are 35.92 and 23.24, respectively, and the Distinct-2 scores are 16.32 and 15.93, respectively. The training corpus is publicly available (https://github.com/wulaoshi/dialogue_train_data).
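For readers unfamiliar with the Distinct-2 metric reported above, the following is a minimal sketch of how a corpus-level Distinct-n score is commonly computed: the number of unique n-grams divided by the total number of n-grams over all generated responses. The function names `ngrams` and `distinct_n` are illustrative and not taken from the paper; the tokenization and the scale of the score (ratio versus percentage) used by the authors are not stated in the abstract, so both are assumptions here.

```python
from typing import Iterable, List, Sequence, Tuple


def ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of a token sequence (empty if too short)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def distinct_n(responses: Iterable[Sequence[str]], n: int = 2) -> float:
    """Corpus-level Distinct-n: unique n-grams divided by total n-grams.

    Assumption: responses are already tokenized; the score is returned as a
    ratio in [0, 1], so multiply by 100 for a percentage-style value.
    """
    total = 0
    unique = set()
    for tokens in responses:
        grams = ngrams(tokens, n)
        total += len(grams)
        unique.update(grams)
    return len(unique) / total if total else 0.0


# Two toy responses sharing the bigram ("a", "b"): 3 unique out of 4 bigrams.
toy_responses = [["a", "b", "c"], ["a", "b", "d"]]
toy_score = distinct_n(toy_responses, n=2)
```

A higher score indicates less repetition across the generated responses, which is why the metric is typically reported alongside BLEU when response diversity matters.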
Notes
In this paper, a piece of knowledge means a sentence formed by joining the elements of a knowledge triple.
Acknowledgements
This work is supported by the Key Program of National Science Foundation of China (Grant No. 61836006) and partially supported by National Natural Science Fund for Distinguished Young Scholar (Grant No. 61625204).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Luo, C., Liu, D., Li, C. et al. Prediction, selection, and generation: a knowledge-driven conversation system. Neural Comput & Applic 34, 20431–20446 (2022). https://doi.org/10.1007/s00521-022-07314-1
DOI: https://doi.org/10.1007/s00521-022-07314-1