Abstract
Deep semantic matching aims to discriminate the relationship between documents using deep neural networks. In recent years, it has become increasingly popular to organize documents in a graph structure and then leverage both the intrinsic document features and the extrinsic neighbor features for discrimination. Most existing works focus on how to utilize the given neighbors, whereas little effort is made to filter for appropriate neighbors. We argue that the neighbor features can be highly noisy and only partially useful. Thus, the lack of effective neighbor selection not only incurs a great deal of unnecessary computation but also severely restricts matching accuracy.
In this work, we propose a novel framework, Cascaded Deep Semantic Matching (CDSM), for accurate and efficient semantic matching on textual graphs. CDSM is distinguished by its two-stage workflow. In the first stage, a lightweight CNN-based ad-hoc neighbor selector filters useful neighbors for the matching task at a small computation cost; we design both one-step and multi-step selection methods. In the second stage, a high-capacity graph-based matching network computes fine-grained relevance scores based on the well-selected neighbors. Notably, CDSM is a generic framework that accommodates most mainstream graph-based semantic matching networks. The major challenge is that the selector must learn to discriminate the neighbors' usefulness without explicit labels. To cope with this problem, we design a weak-supervision strategy for optimization: we first train the graph-based matching network, and then learn the ad-hoc neighbor selector on top of the annotations produced by the matching network. We conduct extensive experiments on three large-scale datasets, showing that CDSM notably improves semantic matching accuracy and efficiency thanks to its selection of high-quality neighbors. The source code is released at https://github.com/jingjyyao/CDSM.
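As a rough illustration of the cascaded workflow described above, the sketch below separates the two stages: a cheap selector first prunes a document's neighbor set, and only the surviving neighbors are passed to the expensive matcher. All names are hypothetical, and the token-overlap scorer is a toy stand-in for both the paper's CNN-based selector and its graph-based matching network.

```python
def token_overlap(a, b):
    """Toy relevance scorer (Jaccard overlap of word sets); a stand-in
    for a learned scoring model, used here only for illustration."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1)

def select_neighbors(query, neighbors, k=2, score_fn=token_overlap):
    """Stage 1: lightweight ad-hoc selection -- keep only the k neighbors
    judged most useful for matching against `query`."""
    return sorted(neighbors, key=lambda n: score_fn(query, n), reverse=True)[:k]

def cascaded_match(query, doc, neighbors, k=2):
    """Stage 2: score (query, doc) using the document text together with
    its well-selected neighbors, instead of the full noisy neighbor set."""
    kept = select_neighbors(query, neighbors, k=k)
    return token_overlap(query, " ".join([doc] + kept))

query = "graph neural network matching"
neighbors = ["graph matching survey", "cooking recipes", "neural network basics"]
print(select_neighbors(query, neighbors, k=2))  # noisy neighbor is pruned
print(cascaded_match(query, "semantic matching on textual graphs", neighbors))
```

The point of the cascade is that the heavy matcher only ever sees the k neighbors that survive stage one, so its cost is bounded regardless of how many noisy neighbors a document has in the graph.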
Index Terms
- CDSM: Cascaded Deep Semantic Matching on Textual Graphs Leveraging Ad-hoc Neighbor Selection