DOI: 10.1145/3219819.3219928
research-article

MIX: Multi-Channel Information Crossing for Text Matching

Published: 19 July 2018

ABSTRACT

Short text matching plays an important role in many natural language processing tasks such as information retrieval, question answering, and conversational systems. Conventional text matching methods rely on predefined templates and rules, which are ill-suited to short texts with a limited number of words and which generalize poorly to unobserved data. Many recent efforts have therefore applied deep neural network models to natural language processing tasks, reducing the cost of feature engineering. In this paper, we present the design of Multi-Channel Information Crossing (MIX), a multi-channel convolutional neural network model for text matching with additional attention mechanisms derived from sentence and text semantics. MIX compares text snippets at varied granularities to form a series of multi-channel similarity matrices, which are crossed with another set of carefully designed attention matrices to expose the rich structure of sentences to deep neural networks. We implemented MIX and deployed the system on Tencent's Venus distributed computation platform. Thanks to the carefully engineered multi-channel information crossing, evaluation results suggest that MIX outperforms a wide range of state-of-the-art deep neural network models by at least 11.1% in terms of normalized discounted cumulative gain (NDCG@3) on the English WikiQA dataset. Moreover, we performed online A/B tests with real users on the search service of Tencent QQ Browser. The results suggest that MIX raised the number of clicks on returned results by 5.7%, owing to increased accuracy in query-document matching, which demonstrates the superior performance of MIX in production environments.
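The full paper specifies the actual architecture; as a rough, shape-level illustration of what the abstract describes, the PyTorch sketch below (with the hypothetical MIXSketch module, channel choices, term-weight attention, and dimensions all assumed for exposition, not taken from the paper) builds two similarity matrices between a query and a candidate text at different granularities, crosses them element-wise with an attention matrix formed from per-term importance weights, and scores the stacked channels with a small convolutional network.

```python
# Illustrative sketch only; channel choices, dimensions, and names are assumed
# for exposition and are not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


def window_avg(x):
    # crude bigram-style view: average each token embedding with its right
    # neighbour (the last token is averaged with itself)
    shifted = torch.cat([x[:, 1:], x[:, -1:]], dim=1)
    return (x + shifted) / 2


class MIXSketch(nn.Module):
    """Two similarity channels, crossed with one attention matrix, scored by a CNN."""

    def __init__(self, vocab_size=10000, embed_dim=64, max_len=20):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # hypothetical per-term importance weights (an IDF-like attention signal)
        self.term_weight = nn.Embedding(vocab_size, 1)
        self.conv = nn.Conv2d(in_channels=2, out_channels=8, kernel_size=3, padding=1)
        self.fc = nn.Linear(8 * max_len * max_len, 1)

    def forward(self, q_ids, d_ids):
        q, d = self.embed(q_ids), self.embed(d_ids)              # (B, L, E)

        # multi-channel similarity matrices at two granularities
        unigram = F.normalize(q, dim=-1) @ F.normalize(d, dim=-1).transpose(1, 2)
        bigram = (F.normalize(window_avg(q), dim=-1)
                  @ F.normalize(window_avg(d), dim=-1).transpose(1, 2))
        sims = torch.stack([unigram, bigram], dim=1)             # (B, 2, Lq, Ld)

        # attention matrix: outer product of per-term weights, so that matches
        # between important terms are boosted and the rest are damped
        wq = torch.sigmoid(self.term_weight(q_ids)).squeeze(-1)  # (B, Lq)
        wd = torch.sigmoid(self.term_weight(d_ids)).squeeze(-1)  # (B, Ld)
        attn = (wq.unsqueeze(2) * wd.unsqueeze(1)).unsqueeze(1)  # (B, 1, Lq, Ld)

        crossed = sims * attn                                    # information crossing
        h = F.relu(self.conv(crossed))                           # (B, 8, Lq, Ld)
        return self.fc(h.flatten(1)).squeeze(-1)                 # matching score per pair


if __name__ == "__main__":
    model = MIXSketch()
    q = torch.randint(0, 10000, (4, 20))   # toy query token ids
    d = torch.randint(0, 10000, (4, 20))   # toy candidate token ids
    print(model(q, d).shape)               # torch.Size([4])
```

In this sketch the element-wise product is the "crossing" step; the system described in the abstract uses a richer series of similarity channels and carefully designed attention matrices, so this should be read as a schematic of the data flow rather than a reference implementation.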

Supplemental Material

a1513p.mp4 (MP4 video, 3.1 MB)


Published in

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2018, 2925 pages
ISBN: 9781450355520
DOI: 10.1145/3219819
Copyright © 2018 ACM


Publisher

Association for Computing Machinery, New York, NY, United States



Acceptance Rates

KDD '18 paper acceptance rate: 107 of 983 submissions, 11%. Overall acceptance rate: 1,133 of 8,635 submissions, 13%.
