δ-Transitive closures and triangle consistency checking: a new way to evaluate graph pattern queries in large graph databases

The Journal of Supercomputing

Abstract

Recently, graph databases have received much attention in the research community because of their extensive practical applications, such as social networks, biological networks, and the World Wide Web. These applications raise many challenging data management problems, including subgraph search, shortest path queries, reachability verification, and pattern matching. Among them, graph pattern matching, which finds all matches in a data graph G for a given pattern graph Q, is more general and flexible than the other problems mentioned above. In this paper, we address a kind of graph matching, the so-called graph matching with δ, in which an edge in Q is allowed to match a path of length ≤ δ in G. To reduce the search space when exploring G to find matches, we propose a new index structure and a novel pruning technique that eliminate many unqualified vertices before join operations are carried out. Extensive experiments show that our approach substantially improves running time compared with existing ones.
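To make the matching condition concrete, the following is a minimal sketch of a naive check for graph matching with δ. It is not the index structure or pruning technique proposed in the paper. It assumes a directed, unlabeled data graph stored as an adjacency dictionary, a path length counted in edges (between 1 and δ), and a candidate mapping from pattern vertices to data vertices. The names `within_delta` and `is_delta_match` are hypothetical and chosen only for illustration.

```python
from collections import deque


def within_delta(graph, source, delta):
    """Return {vertex: distance} for vertices reachable from `source`
    by a path of at least 1 and at most `delta` edges (bounded BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == delta:
            continue  # do not expand beyond the distance bound
        for v in graph.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    del dist[source]  # assumption: a vertex does not match itself
    return dist


def is_delta_match(graph, pattern_edges, mapping, delta):
    """Check that every pattern edge (a, b) maps to a path of
    length <= delta from mapping[a] to mapping[b] in the data graph."""
    for a, b in pattern_edges:
        if mapping[b] not in within_delta(graph, mapping[a], delta):
            return False
    return True


# Data graph G: a chain a -> b -> c -> d (illustrative data only).
G = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}
# Pattern graph Q: a single edge x -> y.
Q_edges = [("x", "y")]
candidate = {"x": "a", "y": "c"}

match_with_delta_2 = is_delta_match(G, Q_edges, candidate, 2)
match_with_delta_1 = is_delta_match(G, Q_edges, candidate, 1)
```

In this toy example the pattern edge x → y is satisfied by the path a → b → c when δ = 2 but not when δ = 1. Running a bounded search for every pattern edge and every candidate mapping is expensive on large graphs, which is the cost that an index and early pruning of unqualified vertices aim to avoid.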

Author information

Corresponding author

Correspondence to Yangjun Chen.

About this article

Cite this article

Chen, Y., Guo, B. & Huang, X. δ-Transitive closures and triangle consistency checking: a new way to evaluate graph pattern queries in large graph databases. J Supercomput 76, 8140–8174 (2020). https://doi.org/10.1007/s11227-019-02762-4

  • DOI: https://doi.org/10.1007/s11227-019-02762-4
