ABSTRACT
Short text matching plays an important role in many natural language processing tasks such as information retrieval, question answering, and conversational systems. Conventional text matching methods rely on predefined templates and rules, which are ill-suited to short texts with a limited number of words and which generalize poorly to unobserved data. Many recent efforts apply deep neural network models to natural language processing tasks, reducing the cost of feature engineering. In this paper, we present the design of Multi-Channel Information Crossing (MIX), a multi-channel convolutional neural network model for text matching, with additional attention mechanisms derived from sentence and text semantics. MIX compares text snippets at varied granularities to form a series of multi-channel similarity matrices, which are crossed with another set of carefully designed attention matrices to expose the rich structures of sentences to deep neural networks. We implemented MIX and deployed the system on Tencent's Venus distributed computation platform. Thanks to carefully engineered multi-channel information crossing, evaluation results suggest that MIX outperforms a wide range of state-of-the-art deep neural network models by at least 11.1% in terms of the normalized discounted cumulative gain (NDCG@3) on the English WikiQA dataset. Moreover, we performed online A/B tests with real users on the search service of Tencent QQ Browser. Results suggest that MIX raised the number of clicks on the returned results by 5.7%, owing to increased accuracy in query-document matching, which demonstrates the superior performance of MIX in production environments.
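For readers unfamiliar with the reported metric, the following is a minimal sketch of how NDCG@k can be computed for a single ranked list. It assumes the common exponential-gain, log2-discount formulation; the abstract does not state which variant the evaluation used, and the function names `dcg_at_k` and `ndcg_at_k` are illustrative only. With binary relevance labels, the exponential and linear gain variants coincide.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k items, in ranked order.

    Assumed formulation: gain 2^rel - 1, discount log2(rank + 1).
    """
    return sum(
        (2 ** rel - 1) / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances: Sequence[float], k: int = 3) -> float:
    """DCG@k normalized by the DCG@k of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    # A list with no relevant items has no ideal gain; score it as 0.
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

For example, `ndcg_at_k([0, 0, 1])` evaluates to 0.5: the single relevant answer sits at rank 3, where its gain is discounted by log2(4) = 2, whereas the ideal ordering would place it at rank 1. A corpus-level NDCG@3 would then be the mean of this value over all queries.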
Index Terms
- MIX: Multi-Channel Information Crossing for Text Matching