skip to main content
10.1145/3626772.3657773acmconferencesArticle/Chapter ViewAbstractPublication PagesirConference Proceedingsconference-collections
research-article

Grand: A Fast and Accurate Graph Retrieval Framework via Knowledge Distillation

Published: 11 July 2024 Publication History

Abstract

Graph retrieval aims to find the most similar graphs in a graph database given a query graph, which is a fundamental problem with many real-world applications in chemical engineering, code analysis, etc. To date, existing neural graph retrieval methods generally fall into two categories: Embedding Based Paradigm (Ebp) and Matching Based Paradigm (Mbp). The Ebp models learn an individual vectorial representation for each graph and the retrieval process can be accelerated by pre-computing these representations. The Mbp models learn a neural matching function to compare graphs on a pair-by-pair basis, in which the fine-grained pairwise comparison leads to higher retrieval accuracy but severely degrades retrieval efficiency. In this paper, to combine the advantage of Ebp in retrieval efficiency with that of Mbp in retrieval accuracy, we propose a novel Graph RetrievAl framework via KNowledge Distillation, namely GRAND. The key point is to leverage the idea of knowledge distillation to transfer the fine-grained graph comparison knowledge from an Mbp model to an Ebp model, such that the Ebp model can generate better graph representations and thus yield higher retrieval accuracy. At the same time, we can still pre-compute and index the improved graph representations to retain the retrieval speed of Ebp. Towards this end, we propose to perform knowledge distillation from three perspectives: score, node, and subgraph levels. In addition, we propose to perform mutual two-way knowledge transfer between Mbp and Ebp, such that Mbp and Ebp complement and benefit each other. Extensive experiments on three real-world datasets show that GRAND improves the performance of Ebp by a large margin and the improvement is consistent for different combinations of Ebp and Mbp models. For example, GRAND achieves performance gains of mostly more than 10% and up to 16.88% in terms of Recall@K on different datasets.

References

[1]
Yunsheng Bai, Hao Ding, Song Bian, Ting Chen, Yizhou Sun, and Wei Wang. 2019. Simgnn: A neural network approach to fast graph similarity computation. In WSDM.
[2]
Yunsheng Bai, Hao Ding, Ken Gu, Yizhou Sun, and Wei Wang. 2020. Learningbased efficient graph similarity computation via multi-scale convolutional set matching. In AAAI.
[3]
Horst Bunke and Kim Shearer. 1998. A graph distance metric based on the maximal common subgraph. Pattern recognition letters (1998).
[4]
Zhe Cao, Tao Qin, Tie-Yan Liu, Ming-Feng Tsai, and Hang Li. 2007. Learning to rank: from pairwise approach to listwise approach. In ICML.
[5]
Xiaoyang Chen, Hongwei Huo, Jun Huan, and Jeffrey Scott Vitter. 2019. An efficient algorithm for graph edit distance computation. Knowledge-Based Systems (2019).
[6]
Zhengdao Chen, Lei Chen, Soledad Villar, and Joan Bruna. 2020. Can graph neural networks count substructures?. In NeurIPS.
[7]
Tyler Derr, Hamid Karimi, Xiaorui Liu, Jiejun Xu, and Jiliang Tang. 2021. Deep adversarial network alignment. In CIKM.
[8]
Matthias Fey, Jan E Lenssen, Christopher Morris, Jonathan Masci, and Nils M Kriege. 2020. Deep Graph Matching Consensus. In ICLR.
[9]
Justin Gilmer, Samuel S Schoenholz, Patrick F Riley, Oriol Vinyals, and George E Dahl. 2017. Neural message passing for quantum chemistry. In ICML.
[10]
Kunal Goyal, Utkarsh Gupta, Abir De, and Soumen Chakrabarti. 2020. Deep Neural Matching Models for Graph Retrieval. In SIGIR.
[11]
William L Hamilton, Rex Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. In NIPS.
[12]
Peter E Hart, Nils J Nilsson, and Bertram Raphael. 1968. A formal basis for the heuristic determination of minimum cost paths. IEEE transactions on Systems Science and Cybernetics (1968).
[13]
Mark Heimann, Haoming Shen, Tara Safavi, and Danai Koutra. 2018. Regal: Representation learning-based graph alignment. In CIKM.
[14]
Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. 2015. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015).
[15]
Elad Hoffer and Nir Ailon. 2015. Deep metric learning using triplet network. In SIMBAD. Springer.
[16]
Jui-Ting Huang, Ashish Sharma, Shuying Sun, Li Xia, David Zhang, Philip Pronin, Janani Padmanabhan, Giuseppe Ottaviano, and Linjun Yang. 2020. Embeddingbased retrieval in facebook search. In KDD.
[17]
Jeff Johnson, Matthijs Douze, and Hervé Jégou. 2019. Billion-scale similarity search with gpus. IEEE Transactions on Big Data (2019).
[18]
George Karypis and Vipin Kumar. 1998. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM Journal on scientific Computing (1998).
[19]
Jongik Kim, Dong-Hoon Choi, and Chen Li. 2019. Inves: Incremental Partitioning- Based Verification for Graph Similarity Search. In EDBT.
[20]
Thomas N Kipf and MaxWelling. 2017. Semi-supervised classification with graph convolutional networks. In ICLR.
[21]
Yujia Li, Chenjie Gu, Thomas Dullien, Oriol Vinyals, and Pushmeet Kohli. 2019. Graph matching networks for learning the similarity of graph structured objects. In ICML.
[22]
Yongjiang Liang and Peixiang Zhao. 2017. Similarity search in graph databases: A multi-layered indexing approach. In ICDE.
[23]
Xiang Ling, Lingfei Wu, Saizhuo Wang, Tengfei Ma, Fangli Xu, Alex X Liu, Chunming Wu, and Shouling Ji. 2021. Multilevel Graph Matching Networks for Deep Graph Similarity Learning. TNNLS (2021).
[24]
Xiang Ling, Lingfei Wu, Saizhuo Wang, Gaoning Pan, Tengfei Ma, Fangli Xu, Alex X Liu, Chunming Wu, and Shouling Ji. 2021. Deep Graph Matching and Searching for Semantic Code Retrieval. TKDD (2021).
[25]
Xin Liu, Haojie Pan, Mutian He, Yangqiu Song, Xin Jiang, and Lifeng Shang. 2020. Neural subgraph isomorphism counting. In KDD.
[26]
Michel Neuhaus and Horst Bunke. 2007. Bridging the gap between graph edit distance and kernel machines. World Scientific.
[27]
Ferran Parés, Dario Garcia Gasulla, Armand Vilalta, Jonatan Moreno, Eduard Ayguadé, Jesús Labarta, Ulises Cortés, and Toyotaro Suzumura. 2017. Fluid communities: A competitive, scalable and diverse community detection algorithm. In Complex Networks & Their Applications.
[28]
Can Qin, Handong Zhao, Lichen Wang, Huan Wang, Yulun Zhang, and Yun Fu. 2021. Slow learning and fast inference: Efficient graph similarity computation via knowledge distillation. Advances in Neural Information Processing Systems 34 (2021), 14110--14121.
[29]
Zongyue Qin, Yunsheng Bai, and Yizhou Sun. 2020. GHashing: Semantic Graph Hashing for Approximate Similarity Search in Graph Databases. In KDD.
[30]
Sayan Ranu, Minh Hoang, and Ambuj Singh. 2014. Answering top-k representative queries on graph databases. In SIGMOD.
[31]
Haichuan Shang, Xuemin Lin, Ying Zhang, Jeffrey Xu Yu, and Wei Wang. 2010. Connected substructure similarity search. In SIGMOD.
[32]
Damian Szklarczyk, Alberto Santos, Christian Von Mering, Lars Juhl Jensen, Peer Bork, and Michael Kuhn. 2016. STITCH 5: augmenting protein-chemical interaction networks with tissue and affinity data. Nucleic acids research (2016).
[33]
Michael Taylor, John Guiver, Stephen Robertson, and Tom Minka. 2008. Softrank: optimizing non-smooth rank metrics. In WSDM.
[34]
Yuanyuan Tian, Richard C Mceachin, Carlos Santos, David J States, and JigneshM Patel. 2007. SAGA: a subgraph matching tool for biological graphs. Bioinformatics (2007).
[35]
Esko Ukkonen. 1992. Approximate string-matching with q-grams and maximal matches. Theoretical computer science (1992).
[36]
Petar Veli?kovi?, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. 2018. Graph attention networks. In ICLR.
[37]
Guoren Wang, Bin Wang, Xiaochun Yang, and Ge Yu. 2010. Efficiently indexing large sparse graphs for similarity search. TKDE (2010).
[38]
Fen Xia, Tie-Yan Liu, Jue Wang, Wensheng Zhang, and Hang Li. 2008. Listwise approach to learning to rank: theory and algorithm. In ICML.
[39]
Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. 2019. How powerful are graph neural networks?. In ICLR.
[40]
Xifeng Yan, Philip S Yu, and Jiawei Han. 2005. Substructure similarity search in graph databases. In SIGMOD.
[41]
Cheng Yang, Jiawei Liu, and Chuan Shi. 2021. Extract the knowledge of graph neural networks and go beyond it: An effective knowledge distillation framework. In Proceedings of the web conference 2021. 1227--1237.
[42]
Zhitao Ying, Andrew Wang, Jiaxuan You, Chengtao Wen, Arquimedes Canedo, and Jure Leskovec. 2020. Neural subgraph matching. arXiv preprint arXiv:2007.03092 (2020).
[43]
Ye Yuan, Guoren Wang, Jeffery Yu Xu, and Lei Chen. 2015. Efficient distributed subgraph similarity matching. The VLDB Journal (2015).
[44]
Zhiping Zeng, Anthony KH Tung, Jianyong Wang, Jianhua Feng, and Lizhu Zhou. 2009. Comparing stars: On approximating graph edit distance. In VLDB.
[45]
Shijie Zhang, Jiong Yang, and Wei Jin. 2010. Sapper: Subgraph indexing and approximate matching in large graphs. In VLDB.
[46]
Ying Zhang, Tao Xiang, Timothy M Hospedales, and Huchuan Lu. 2018. Deep mutual learning. In CVPR.
[47]
Zhen Zhang, Jiajun Bu, Martin Ester, Zhao Li, Chengwei Yao, Zhi Yu, and Can Wang. 2021. H2MN: Graph Similarity Learning with Hierarchical Hypergraph Matching Networks. In KDD.
[48]
Xiang Zhao, Chuan Xiao, Xuemin Lin, Qing Liu, and Wenjie Zhang. 2013. A partition-based approach to structure similarity search. In VLDB.
[49]
Xiang Zhao, Chuan Xiao, Xuemin Lin, Wei Wang, and Yoshiharu Ishikawa. 2013. Efficient processing of graph similarity queries with edit distance constraints. The VLDB Journal (2013).
[50]
Gaoping Zhu, Xuemin Lin, Ke Zhu, Wenjie Zhang, and Jeffrey Xu Yu. 2012. TreeSpan: efficiently computing similarity all-matching. In SIGMOD.

Cited By

View all
  • (2024)PAIR: Pre-denosing Augmented Image Retrieval Model for Defending Adversarial PatchesProceedings of the 32nd ACM International Conference on Multimedia10.1145/3664647.3681398(5771-5779)Online publication date: 28-Oct-2024

Index Terms

  1. Grand: A Fast and Accurate Graph Retrieval Framework via Knowledge Distillation

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2024
    3164 pages
    ISBN:9798400704314
    DOI:10.1145/3626772
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 11 July 2024

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. gnn
    2. graph retrieval
    3. knowledge distillation

    Qualifiers

    • Research-article

    Funding Sources

    Conference

    SIGIR 2024
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)256
    • Downloads (Last 6 weeks)37
    Reflects downloads up to 16 Feb 2025

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)PAIR: Pre-denosing Augmented Image Retrieval Model for Defending Adversarial PatchesProceedings of the 32nd ACM International Conference on Multimedia10.1145/3664647.3681398(5771-5779)Online publication date: 28-Oct-2024

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Figures

    Tables

    Media

    Share

    Share

    Share this Publication link

    Share on social media