Abstract
Selecting a proper response is a crucial challenge in retrieval-based chatbots. State-of-the-art methods either match a response with the word sequence of the whole context, or match the response with each utterance in the context and then accumulate the matching information. The former architecture can lose important local matching information in utterance–response pairs and does not explicitly capture the relationships and dependencies among utterances. The latter architecture ignores important global matching information because the response is never matched with the context as a whole at the word level. Both approaches therefore overlook the fact that matching a response with different levels of a context can capture different information for multi-turn response selection. In this work, we propose a hierarchical matching network that matches a response with a context at both the word level and the utterance level. At the word level, we concatenate the multi-turn context into one long word sequence and apply a text matching model to match the response with this sequence, which captures important word-level matching information. At the utterance level, we employ the identical text matching model to match the response with each utterance in the context, capturing the matching information of each utterance–response pair, and then accumulate this information with a recurrent neural network to model the relationships among utterances. Finally, the hierarchical matching information is fused to obtain the final matching information. Experiments on two large-scale public multi-turn response selection datasets show that the proposed model significantly outperforms the state-of-the-art baseline models.
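
The two matching levels and the fusion step described in the abstract can be illustrated with a small sketch. This is only a toy under stated assumptions, not the model from the paper: the abstract does not specify the text matching model, the recurrent cell, or the fusion function, so `embed`, `match`, `rnn_accumulate`, and `hierarchical_match` are hypothetical names, the embeddings are pseudo-random rather than learned, the matcher is a pooled cosine-similarity stand-in, and all weights are fixed and untrained.

```python
import math
import random
import zlib

DIM = 8  # toy embedding size (assumption, not from the paper)


def embed(token):
    """Deterministic pseudo-random unit vector per token (stand-in for learned embeddings)."""
    rng = random.Random(zlib.crc32(token.encode("utf-8")))
    vec = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def match(response, sequence):
    """Stand-in text matching model: cosine similarities pooled into 2 features.

    Assumes both token lists are non-empty.
    """
    r_vecs = [embed(t) for t in response]
    s_vecs = [embed(t) for t in sequence]
    # Best cosine similarity of each response token against the sequence
    best = [max(sum(a * b for a, b in zip(r, s)) for s in s_vecs) for r in r_vecs]
    return [sum(best) / len(best), max(best)]


def rnn_accumulate(features, hidden_size=2):
    """Plain tanh recurrent cell with fixed diagonal toy weights (untrained)."""
    h = [0.0] * hidden_size
    for x in features:
        h = [math.tanh(0.5 * x[i] + 0.5 * h[i]) for i in range(hidden_size)]
    return h


def hierarchical_match(context, response):
    """Score a response against a multi-turn context (list of token lists)."""
    # Word level: treat the whole context as one long word sequence
    long_seq = [tok for utt in context for tok in utt]
    word_level = match(response, long_seq)
    # Utterance level: same matcher per utterance, accumulated in turn order
    per_utt = [match(response, utt) for utt in context]
    utt_level = rnn_accumulate(per_utt)
    # Fusion: concatenate both levels and squash with equal toy weights
    fused = word_level + utt_level
    logit = sum(fused) / len(fused)
    return 1.0 / (1.0 + math.exp(-logit))


context = [["how", "do", "i", "install", "python"], ["which", "os"], ["ubuntu"]]
score = hierarchical_match(context, ["use", "apt", "install", "python"])
```

The same `match` function is reused at both levels, mirroring the abstract's statement that the identical text matching model is employed for word-level and utterance-level matching; in a trained system each placeholder would be replaced by learned components.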

Acknowledgements
This work is partially supported by the Natural Science Foundation of China (No. 61632011).
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
Ethical Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Ma, H., Wang, J., Lin, H. et al. Hierarchical matching network for multi-turn response selection in retrieval-based chatbots. Soft Comput 25, 9609–9624 (2021). https://doi.org/10.1007/s00500-021-05699-0
DOI: https://doi.org/10.1007/s00500-021-05699-0