Abstract
Graph pattern matching is widely used in real-world applications such as social network analysis. Because traditional subgraph isomorphism is NP-complete and often too restrictive to capture sensible matches, relaxed graph pattern matching models are used instead. However, existing algorithms for these models suffer from limited scalability and a restricted degree of parallelism. In this paper, we propose two fast parallel algorithms, GPGS and GPDS, for graph simulation and dual simulation, respectively. Both exploit GPU performance by adopting an edge-centric processing model: the matching constraints of each vertex are evaluated through parallel computations over the edges of the data graph, which makes the algorithms fast and scalable. To the best of our knowledge, these are the first GPU-based algorithms for graph simulation and dual simulation. Extensive experiments on synthetic and real-world data graphs demonstrate that our algorithms significantly outperform existing methods, achieving speedups of up to 74.8\(\times \) for GPGS and up to 114.2\(\times \) for GPDS.
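To make the edge-centric idea concrete, the following is a minimal sequential sketch in Python of graph simulation and dual simulation computed by repeated passes over the data graph edges. It is not the GPGS or GPDS implementation: the function name `simulate`, its parameters, and the dictionary-based graph encoding are assumptions made for illustration. The sketch only follows the standard definitions, in which a data vertex stays a candidate for a pattern vertex if it has the same label and, for every outgoing pattern edge, a matching child (plus, for dual simulation, a matching parent for every incoming pattern edge).

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Vertex = Hashable
Edge = Tuple[Vertex, Vertex]


def simulate(
    q_labels: Dict[Vertex, str],
    q_edges: Iterable[Edge],
    g_labels: Dict[Vertex, str],
    g_edges: Iterable[Edge],
    dual: bool = False,
) -> Dict[Vertex, Set[Vertex]]:
    """Return the maximum (dual) simulation as {pattern vertex: data vertices}.

    Hypothetical helper for illustration; all names are assumptions.
    """
    q_edges = list(q_edges)
    g_edges = list(g_edges)

    # Initial candidates: data vertices carrying the same label.
    sim = {
        u: {v for v, lab in g_labels.items() if lab == q_lab}
        for u, q_lab in q_labels.items()
    }

    changed = True
    while changed:
        changed = False
        has_child = set()   # (u, x, v): v in sim[u] has a child in sim[x]
        has_parent = set()  # (u, x, w): w in sim[x] has a parent in sim[u]

        # Edge-centric pass: every data edge is inspected independently,
        # so on a GPU this loop could be mapped to one thread per edge
        # (an assumption about the mapping, not the paper's kernel).
        for v, w in g_edges:
            for u, x in q_edges:
                if v in sim[u] and w in sim[x]:
                    has_child.add((u, x, v))
                    has_parent.add((u, x, w))

        # Vertex update: drop candidates that lack the required support.
        for u, x in q_edges:
            keep = {v for v in sim[u] if (u, x, v) in has_child}
            if keep != sim[u]:
                sim[u] = keep
                changed = True
            if dual:
                keep = {w for w in sim[x] if (u, x, w) in has_parent}
                if keep != sim[x]:
                    sim[x] = keep
                    changed = True

    # Convention: if some pattern vertex has no match, there is no match.
    if any(not candidates for candidates in sim.values()):
        return {u: set() for u in q_labels}
    return sim


if __name__ == "__main__":
    q_labels = {"a": "A", "b": "B", "c": "C"}
    q_edges = [("a", "b"), ("b", "c")]
    g_labels = {0: "A", 1: "B", 2: "C", 3: "A", 4: "B", 5: "C"}
    g_edges = [(0, 1), (1, 2), (3, 4)]
    print(simulate(q_labels, q_edges, g_labels, g_edges))
    print(simulate(q_labels, q_edges, g_labels, g_edges, dual=True))
```

In the small example at the bottom, vertex 4 is removed first because it has no child labelled C, which in turn removes vertex 3 in the next pass. The isolated vertex 5 survives plain simulation, since the pattern vertex `c` has no outgoing edges, but is removed under dual simulation because it has no matching parent.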













Data availability
The datasets generated and analyzed in this study can be obtained from the corresponding author upon reasonable request.
Acknowledgements
This work was funded by the Franco-Algerian program PHC Tassili BiGreen n\(^{\circ }\)18 MDU 111. The experiments were carried out on the GPU station YUVA II, made available by the Research Center on Scientific and Technical Information (CERIST), Algeria.
Author information
Contributions
AB: Conceived and developed the algorithms, conducted experiments, and authored the initial manuscript. SY: Contributed to the research design, conducted the validation and analysis of the results, provided supervision, and engaged in the review and editing process. SB: Contributed to the research design, provided experimental resources, and contributed to the review and editing of the manuscript. HK: Provided supervision and contributed to the review and editing process. NN-T: Provided supervision and contributed to the review and editing process. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Benachour, A., Yahiaoui, S., Bouhenni, S. et al. GPU-accelerated relaxed graph pattern matching algorithms. J Supercomput 80, 21811–21836 (2024). https://doi.org/10.1007/s11227-024-06283-7
DOI: https://doi.org/10.1007/s11227-024-06283-7