Skip to main content
Log in

Fast subgraph query processing and subgraph matching via static and dynamic equivalences

  • Regular Paper
  • Published:
The VLDB Journal Aims and scope Submit manuscript

Abstract

Subgraph query processing (also known as subgraph search) and subgraph matching are fundamental graph problems in many application domains. A lot of efforts have been made to develop practical solutions for these problems. Despite the efforts, existing algorithms showed limited running time and scalability in dealing with large and/or many graphs. In this paper, we propose a new subgraph search algorithm using equivalences of vertices in order to reduce search space: (1) static equivalence of vertices in a query graph that leads to an efficient matching order of the vertices and (2) dynamic equivalence of candidate vertices in a data graph, which enables us to capture and remove redundancies in search space. These techniques for subgraph search also lead to an improved algorithm for subgraph matching. Experiments show that our approach outperforms state-of-the-art subgraph search and subgraph matching algorithms by up to several orders of magnitude with respect to query processing time.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17
Fig. 18
Fig. 19
Fig. 20
Fig. 21
Fig. 22
Fig. 23
Fig. 24
Fig. 25
Fig. 26
Fig. 27
Fig. 28
Fig. 29

Similar content being viewed by others

Notes

  1. https://github.com/SNUCSE-CTA/VEQ

References

  1. Aberger, C.R., Lamb, A., Tu, S., Nötzli, A., Olukotun, K., Ré, C.: Emptyheaded: A relational engine for graph processing. ACM Trans. Datab. Syst. (TODS) 42(4), 1–44 (2017)

    Article  MathSciNet  Google Scholar 

  2. Bhattarai, B., Liu, H., Huang, H.H.: Ceci: compact embedding cluster index for scalable subgraph matching. In: Proceedings of the 2019 International Conference on Management of Data, pp. 1447–1462 (2019)

  3. Bi, F., Chang, L., Lin, X., Qin, L., Zhang, W.: Efficient subgraph matching by postponing cartesian products. In: Proceedings of ACM SIGMOD, pp. 1199–1214 (2016)

  4. Bonnici, V., Ferro, A., Giugno, R., Pulvirenti, A., Shasha, D.: Enhancing graph database indexing by suffix tree structure. In: IAPR International Conference on Pattern Recognition in Bioinformatics, pp. 195–203. Springer, Berlin (2010)

  5. Bonnici, V., Giugno, R., Pulvirenti, A., Shasha, D., Ferro, A.: A subgraph isomorphism algorithm and its application to biochemical data. BMC Bioinf. 14(7), 1–13 (2013)

    Google Scholar 

  6. Cannataro, M., Guzzi, P.H.: Data Management of Protein Interaction Networks, vol. 17. John Wiley and Sons, New Jersey (2012)

    Google Scholar 

  7. Carletti, V., Foggia, P., Saggese, A., Vento, M.: Challenging the time complexity of exact subgraph isomorphism for huge and dense graphs with vf3. IEEE Trans. Pattern Anal. Mach. Intell. 40(4), 804–818 (2017)

    Article  Google Scholar 

  8. Cordella, L.P., Foggia, P., Sansone, C., Vento, M.: A (sub)graph isomorphism algorithm for matching large graphs. IEEE Trans. Pattern Anal. Mach. Intell. 26(10), 1367–1372 (2004)

    Article  Google Scholar 

  9. Di Natale, R., Ferro, A., Giugno, R., Mongiovì, M., Pulvirenti, A., Shasha, D.: Sing: Subgraph search in non-homogeneous graphs. BMC Bioinf. 11(1), 96 (2010)

    Article  Google Scholar 

  10. Fan, W.: Graph pattern matching revised for social network analysis. In: Proceedings of ICDT, pp. 8–21 (2012)

  11. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman and Co. (1979)

  12. Giugno, R., Bonnici, V., Bombieri, N., Pulvirenti, A., Ferro, A., Shasha, D.: Grapes: a software for parallel searching on biological graphs targeting multi-core architectures. PLoS ONE 8(10), e76911 (2013)

    Article  Google Scholar 

  13. Han, M., Kim, H., Gu, G., Park, K., Han, W.S.: Efficient subgraph matching: Harmonizing dynamic programming, adaptive matching order, and failing set together. In: Proceedings of ACM SIGMOD, pp. 1429–1446 (2019)

  14. Han, W.S., Lee, J., Lee, J.H.: Turbo iso: Towards Ultrafast and Robust Subgraph Isomorphism Search in Large Graph Databases. In: Proceedings of ACM SIGMOD, pp. 337–348 (2013)

  15. Han, W.S., Lee, J., Pham, M.D., Yu, J.X.: igraph: a framework for comparisons of disk-based graph indexing techniques. Proc. VLDB Endow. 3(1–2), 449–459 (2010)

    Article  Google Scholar 

  16. He, H., Singh, A.K.: Graphs-at-a-time: query language and access methods for graph databases. In: Proceedings of ACM SIGMOD, pp. 405–418 (2008)

  17. Kankanamge, C., Sahu, S., Mhedbhi, A., Chen, J., Salihoglu, S.: Graphflow: an active graph database. In: Proceedings of the 2017 ACM International Conference on Management of Data, pp. 1695–1698 (2017)

  18. Katsarou, F., Ntarmos, N., Triantafillou, P.: Performance and scalability of indexed subgraph query processing methods. Proc. VLDB Endow. 8(12), 1566–1577 (2015)

    Article  Google Scholar 

  19. Kim, H., Choi, Y., Park, K., Lin, X., Hong, S.H., Han, W.S.: Versatile equivalences: Speeding up subgraph query processing and subgraph matching. In: Proceedings of ACM SIGMOD, pp. 925–937 (2021)

  20. Kim, J., Shin, H., Han, W.S., Hong, S., Chafi, H.: Taming subgraph isomorphism for rdf query processing. Proc. VLDB Endow. 8(11) (2015)

  21. Kim, K., Seo, I., Han, W.S., Lee, J.H., Hong, S., Chafi, H., Shin, H., Jeong, G.: Turboflux: A fast continuous subgraph matching system for streaming graph data. In: Proceedings of ACM SIGMOD, pp. 411–426 (2018)

  22. Klein, K., Kriege, N., Mutzel, P.: Ct-index: Fingerprint-based graph indexing combining cycles and trees. In: Proceedings of IEEE ICDE, pp. 1115–1126 (2011)

  23. Lee, J., Han, W.S., Kasperovics, R., Lee, J.H.: An in-depth comparison of subgraph isomorphism algorithms in graph databases. Proc. VLDB Endow. 6(2), 133–144 (2012)

    Article  Google Scholar 

  24. Leskovec, J., Krevl, A.: SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data (2014)

  25. Liang, Y., Zhao, P.: Workload-aware subgraph query caching and processing in large graphs. In: Proceedings of IEEE ICDE, pp. 1754–1757 (2019)

  26. McCreesh, C., Prosser, P., Solnon, C., Trimble, J.: When subgraph isomorphism is really hard, and why this matters for graph databases. J. Artif. Intell. Res. 61, 723–759 (2018)

    Article  MathSciNet  MATH  Google Scholar 

  27. McCreesh, C., Prosser, P., Trimble, J.: Heuristics and really hard instances for subgraph isomorphism problems. In: IJCAI, pp. 631–638 (2016)

  28. McCreesh, C., Prosser, P., Trimble, J.: The glasgow subgraph solver: using constraint programming to tackle hard subgraph isomorphism problem variants. In: International Conference on Graph Transformation, pp. 316–324. Springer (2020)

  29. Mhedhbi, A., Salihoglu, S.: Optimizing subgraph queries by combining binary and worst-case optimal joins. Proc. VLDB Endow. 12(11), 1692–1704 (2019)

    Article  Google Scholar 

  30. Park, H., Kim, M.S.: Evograph: an effective and efficient graph upscaling method for preserving graph properties. In: Proceedings of ACM SIGKDD, pp. 2051–2059 (2018)

  31. Pržulj, N., Corneil, D.G., Jurisica, I.: Efficient estimation of graphlet frequency distributions in protein-protein interaction networks. Bioinformatics 22(8), 974–980 (2006)

    Article  Google Scholar 

  32. Qiao, M., Zhang, H., Cheng, H.: Subgraph matching: on compression and computation. Proc. VLDB Endow. 11(2), 176–188 (2017)

    Article  Google Scholar 

  33. Ren, X., Wang, J.: Exploiting vertex relationships in speeding up subgraph isomorphism over large graphs. Proc. VLDB Endow. 8(5), 617–628 (2015)

    Article  Google Scholar 

  34. Ren, X., Wang, J.: Multi-query optimization for subgraph isomorphism search. Proc. VLDB Endow. 10(3), 121–132 (2016)

    Article  Google Scholar 

  35. Rivero, C.R., Jamil, H.M.: Efficient and scalable labeled subgraph matching using sgmatch. Knowl. Inf. Syst. 51(1), 61–87 (2017)

    Article  Google Scholar 

  36. Sahu, S., Mhedhbi, A., Salihoglu, S., Lin, J., Özsu, M.T.: The ubiquity of large graphs and surprising challenges of graph processing. Proc. VLDB Endow. 11(4), 420–431 (2017)

    Article  Google Scholar 

  37. Shang, H., Zhang, Y., Lin, X., Yu, J.X.: Taming verification hardness: an efficient algorithm for testing subgraph isomorphism. Proc. VLDB Endow. 1(1), 364–375 (2008)

    Article  Google Scholar 

  38. Snijders, T.A., Pattison, P.E., Robins, G.L., Handcock, M.S.: New specifications for exponential random graph models. Sociol. Methodol. 36(1), 99–153 (2006)

    Article  Google Scholar 

  39. Sun, S., Luo, Q.: Scaling up subgraph query processing with efficient subgraph matching. In: Proceedings of IEEE ICDE, pp. 220–231 (2019)

  40. Sun, S., Luo, Q.: In-memory subgraph matching: An in-depth study. In: Proceedings of ACM SIGMOD, pp. 1083–1098 (2020)

  41. Sun, S., Luo, Q.: Subgraph matching with effective matching order and indexing. IEEE Transactions on Knowledge and Data Engineering (2020)

  42. Sun, S., Sun, X., Che, Y., Luo, Q., He, B.: Rapidmatch: a holistic approach to subgraph query processing. Proc. VLDB Endow. 14(2), 176–188 (2020)

    Article  Google Scholar 

  43. Ullmann, J.R.: An algorithm for subgraph isomorphism. J. ACM 23(1), 31–42 (1976)

    Article  MathSciNet  Google Scholar 

  44. Wang, J., Ntarmos, N., Triantafillou, P.: Graphcache: a caching system for graph queries, pp. 13–24 (2017)

  45. Wang, J., Ren, X., Anirban, S., Wu, X.W.: Correct filtering for subgraph isomorphism search in compressed vertex-labeled graphs. Inf. Sci. 482, 363–373 (2019)

    Article  MathSciNet  MATH  Google Scholar 

  46. Yan, X., Yu, P.S., Han, J.: Graph indexing: a frequent structure-based approach. In: Proceedings of ACM SIGMOD, pp. 335–346 (2004)

  47. Yanardag, P., Vishwanathan, S.: Deep graph kernels. In: Proceedings of ACM SIGKDD, pp. 1365–1374 (2015)

  48. Zhang, S., Li, S., Yang, J.: GADDI: Distance index based subgraph matching in biological networks. In: Proceedings of ACM EDBT, pp. 192–203 (2009)

  49. Zhao, P., Han, J.: On graph query optimization in large networks. Proc. VLDB Endow. 3(1–2), 340–351 (2010)

    Article  Google Scholar 

  50. Zhao, P., Yu, J.X., Philip, S.Y.: Graph indexing: Tree+ delta\(>=\) graph. In: Proceedings of VLDB, pp. 938–949 (2007)

  51. Zou, L., Chen, L., Yu, J.X., Lu, Y.: A novel spectral coding in a large graph database. In: Proceedings of EDBT, pp. 181–192 (2008)

Download references

Acknowledgements

Hyunjoon Kim was partly supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-01373, Artificial Intelligence Graduate School Program (Hanyang University)) and the research fund of Hanyang University (HY-202100000003161). Kunsoo Park was supported by Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. 2018-0-00551, Framework of Practical Algorithms for NP-hard Graph Problems). Wook-Shin Han was supported by the National Research Foundation of Korea (NRF) grant (No. NRF-2021R1A2B5B03001551) and Institute of Information & communications Technology Planning & Evaluation (IITP) grant (No. 2018-0-01398).

Author information

Authors and Affiliations

Authors

Corresponding authors

Correspondence to Kunsoo Park or Wook-Shin Han.

Ethics declarations

Conflict of interest

Seok-Hee Hong works in the same university as Alan Fekete of the editorial board. Wook-Shin Han has a conflict of interest with Kyu-Young Whang in the editorial board.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A preliminary version [19] of this paper was presented at Proceedings of the 2021 International Conference on Management of Data (SIGMOD 2021)

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Kim, H., Choi, Y., Park, K. et al. Fast subgraph query processing and subgraph matching via static and dynamic equivalences. The VLDB Journal 32, 343–368 (2023). https://doi.org/10.1007/s00778-022-00749-x

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00778-022-00749-x

Keywords

Navigation