GPU-accelerated relaxed graph pattern matching algorithms

The Journal of Supercomputing, volume 80, pages 21811–21836 (2024)

Abstract

Graph pattern matching is widely used in real-world applications such as social network analysis. Because traditional subgraph isomorphism is NP-complete and often too restrictive to capture sensible matches, relaxed graph pattern matching models are used instead. However, existing algorithms for these models suffer from limited scalability and restricted degrees of parallelism. In this paper, we propose two fast parallel algorithms, GPGS and GPDS, for graph simulation and dual simulation, respectively. They make the most of GPU performance by adopting the edge-centric processing model: computations are performed in parallel over the edges of the data graph to evaluate the matching constraints of each vertex, which yields fast and scalable algorithms. To the best of our knowledge, these are the first GPU-based algorithms for graph simulation and dual simulation. Extensive experiments on synthetic and real-world data graphs demonstrate that our algorithms significantly outperform existing methods, achieving speedups of up to 74.8× for GPGS and up to 114.2× for GPDS.
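To make the two matching models concrete, the following is a minimal sequential sketch in Python of graph simulation and dual simulation computed with edge-centric passes. It is not the authors' GPGS or GPDS implementation, which is not available in this preview; it only follows the standard fixpoint definition of the two models. The function name `simulate`, its parameters, and the dictionary-based graph encoding are assumptions made for illustration. The inner loop over data edges treats each edge independently, which is the part a GPU version would presumably assign to parallel threads.

```python
def simulate(pattern_labels, pattern_edges, data_labels, data_edges, dual=False):
    """Illustrative sketch (hypothetical API, not the paper's code).

    Returns {pattern vertex: set of matching data vertices}.
    If any pattern vertex has no match, every set is empty.
    """
    # Initial candidates: data vertices carrying the same label.
    sim = {p: {v for v, lab in data_labels.items() if lab == plab}
           for p, plab in pattern_labels.items()}

    changed = True
    while changed:
        changed = False
        # Edge-centric pass: every data edge is examined independently
        # and records which candidates it supports.
        child_ok = set()   # (p, q, u): u in sim[p] has a child in sim[q]
        parent_ok = set()  # (p, q, v): v in sim[q] has a parent in sim[p]
        for (u, v) in data_edges:
            for (p, q) in pattern_edges:
                if u in sim[p] and v in sim[q]:
                    child_ok.add((p, q, u))
                    parent_ok.add((p, q, v))

        # Vertex update: drop candidates that violate a constraint.
        for (p, q) in pattern_edges:
            keep = {u for u in sim[p] if (p, q, u) in child_ok}
            if keep != sim[p]:
                sim[p] = keep
                changed = True
            if dual:  # dual simulation also enforces the parent constraint
                keep = {v for v in sim[q] if (p, q, v) in parent_ok}
                if keep != sim[q]:
                    sim[q] = keep
                    changed = True

    if any(not s for s in sim.values()):
        return {p: set() for p in sim}
    return sim


# Pattern: a single edge from an 'A'-labelled vertex to a 'B'-labelled one.
pattern_labels = {"a": "A", "b": "B"}
pattern_edges = [("a", "b")]
# Data graph: vertices 3 ('A') and 4 ('B') are isolated.
data_labels = {1: "A", 2: "B", 3: "A", 4: "B"}
data_edges = [(1, 2)]

print(simulate(pattern_labels, pattern_edges, data_labels, data_edges))
print(simulate(pattern_labels, pattern_edges, data_labels, data_edges, dual=True))
```

In this toy input, graph simulation keeps vertex 4 as a match for `b` because only child constraints are checked, whereas dual simulation removes it since it has no parent matching `a`.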

Data availability

The datasets generated and analyzed in this study can be obtained from the corresponding author upon reasonable request.

Acknowledgements

This work was funded by the Franco-Algerian program PHC Tassili BiGreen n° 18 MDU 111. The experiments were carried out on the GPU station YUVA II, which was made available by the Research Center on Scientific and Technical Information, CERIST (Algeria).

Author information

Contributions

AB: Conceived and developed the algorithms, conducted experiments, and authored the initial manuscript. SY: Contributed to the research design, conducted the validation and analysis of the results, provided supervision, and engaged in the review and editing process. SB: Contributed to the research design, provided experimental resources, and contributed to the review and editing of the manuscript. HK: Provided supervision and contributed to the review and editing process. NN-T: Provided supervision and contributed to the review and editing process. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Amira Benachour.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Benachour, A., Yahiaoui, S., Bouhenni, S. et al. GPU-accelerated relaxed graph pattern matching algorithms. J Supercomput 80, 21811–21836 (2024). https://doi.org/10.1007/s11227-024-06283-7

  • DOI: https://doi.org/10.1007/s11227-024-06283-7
