Abstract
Deep semantic matching aims to discriminate the relationship between documents using deep neural networks. In recent years, it has become increasingly popular to organize documents in a graph structure and then leverage both the intrinsic document features and the extrinsic neighbor features for discrimination. Most existing works focus on how to utilize the given neighbors, whereas little effort is made to filter for appropriate neighbors. We argue that the neighbor features can be highly noisy and only partially useful. Thus, the lack of effective neighbor selection not only incurs a great deal of unnecessary computation but also severely restricts matching accuracy.
In this work, we propose a novel framework, Cascaded Deep Semantic Matching (CDSM), for accurate and efficient semantic matching on textual graphs. CDSM is distinguished by its two-stage workflow. In the first stage, a lightweight CNN-based ad-hoc neighbor selector filters useful neighbors for the matching task at a small computation cost; we design both one-step and multi-step selection methods. In the second stage, a high-capacity graph-based matching network computes fine-grained relevance scores based on the well-selected neighbors. Notably, CDSM is a generic framework that accommodates most mainstream graph-based semantic matching networks. The major challenge is that the selector must learn to discriminate the neighbors' usefulness without explicit labels. To cope with this problem, we design a weak-supervision strategy for optimization: we first train the graph-based matching network, and then learn the ad-hoc neighbor selector on top of the annotations produced by the matching network. We conduct extensive experiments on three large-scale datasets, showing that CDSM notably improves semantic matching accuracy and efficiency thanks to its selection of high-quality neighbors. The source code is released at https://github.com/jingjyyao/CDSM.
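As a rough illustration of the cascaded workflow described above, the sketch below separates the two stages: a cheap selector first prunes a document's neighbor set, and only the surviving neighbors are passed to the expensive matcher. All names are hypothetical, and the token-overlap scorer is a toy stand-in for both the paper's CNN-based selector and its graph-based matching network.

```python
def token_overlap(a, b):
    """Toy relevance scorer (Jaccard overlap of word sets); a stand-in
    for a learned scoring model, used here only for illustration."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1)

def select_neighbors(query, neighbors, k=2, score_fn=token_overlap):
    """Stage 1: lightweight ad-hoc selection -- keep only the k neighbors
    judged most useful for matching against `query`."""
    return sorted(neighbors, key=lambda n: score_fn(query, n), reverse=True)[:k]

def cascaded_match(query, doc, neighbors, k=2):
    """Stage 2: score (query, doc) using the document text together with
    its well-selected neighbors, instead of the full noisy neighbor set."""
    kept = select_neighbors(query, neighbors, k=k)
    return token_overlap(query, " ".join([doc] + kept))

query = "graph neural network matching"
neighbors = ["graph matching survey", "cooking recipes", "neural network basics"]
print(select_neighbors(query, neighbors, k=2))  # noisy neighbor is pruned
print(cascaded_match(query, "semantic matching on textual graphs", neighbors))
```

The point of the cascade is that the heavy matcher only ever sees the k neighbors that survive stage one, so its cost is bounded regardless of how many noisy neighbors a document has in the graph.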
Index Terms
- CDSM: Cascaded Deep Semantic Matching on Textual Graphs Leveraging Ad-hoc Neighbor Selection