DOI: 10.1145/1376616.1376660
research-article

Graphs-at-a-time: query language and access methods for graph databases

Published: 09 June 2008 Publication History

Abstract

With the prevalence of graph data in a variety of domains, there is an increasing need for a language to query and manipulate graphs with heterogeneous attributes and structures. We propose a query language for graph databases that supports arbitrary attributes on nodes, edges, and graphs. In this language, graphs are the basic unit of information and each query manipulates one or more collections of graphs. To allow for flexible compositions of graph structures, we extend the notion of formal languages from strings to the graph domain. We present a graph algebra extended from the relational algebra in which the selection operator is generalized to graph pattern matching and a composition operator is introduced for rewriting matched graphs. Then, we investigate access methods for the selection operator. Pattern matching over large graphs is challenging due to the NP-completeness of subgraph isomorphism. We address this with a combination of techniques: use of neighborhood subgraphs and profiles, joint reduction of the search space, and optimization of the search order. Experimental results on real and synthetic large graphs demonstrate that our graph-specific optimizations outperform an SQL-based implementation by orders of magnitude.
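
To make the pruning ideas named in the abstract concrete, here is a minimal Python sketch of pattern matching with profile-based candidate filtering and a simple search order. It is not the paper's implementation. It assumes that a "profile" is the multiset of node labels within a fixed radius of a node, it orders pattern nodes by candidate-set size as a stand-in for the paper's search-order optimization, and it omits joint reduction of the search space entirely. All function names, the adjacency-dictionary graph encoding, and the example graphs are hypothetical.

```python
from collections import Counter

def profile(adj, labels, node, radius=1):
    """Assumed definition: multiset of labels within `radius` hops of `node`."""
    seen = {node}
    frontier = [node]
    for _ in range(radius):
        nxt = []
        for u in frontier:
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    nxt.append(v)
        frontier = nxt
    return Counter(labels[u] for u in seen)

def candidates(q_adj, q_labels, g_adj, g_labels, radius=1):
    """Keep data nodes with a matching label whose profile contains the pattern node's profile."""
    g_prof = {v: profile(g_adj, g_labels, v, radius) for v in g_adj}
    cand = {}
    for u in q_adj:
        need = profile(q_adj, q_labels, u, radius)
        cand[u] = [
            v for v in g_adj
            if g_labels[v] == q_labels[u]
            and all(g_prof[v][lab] >= n for lab, n in need.items())
        ]
    return cand

def match(q_adj, q_labels, g_adj, g_labels, radius=1):
    """Backtracking subgraph matching over the pruned candidate sets."""
    cand = candidates(q_adj, q_labels, g_adj, g_labels, radius)
    # Greedy heuristic (an assumption): try the most constrained pattern nodes first.
    order = sorted(q_adj, key=lambda u: len(cand[u]))
    results = []

    def extend(i, mapping):
        if i == len(order):
            results.append(dict(mapping))
            return
        u = order[i]
        used = set(mapping.values())
        for v in cand[u]:
            if v in used:
                continue  # mappings must be injective
            # Every already-mapped pattern neighbor must be adjacent in the data graph.
            if all(mapping[w] in g_adj[v] for w in q_adj[u] if w in mapping):
                mapping[u] = v
                extend(i + 1, mapping)
                del mapping[u]

    extend(0, {})
    return results

# Hypothetical example: a labeled triangle pattern and a small data graph.
Q_ADJ = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
Q_LABELS = {"a": "A", "b": "B", "c": "C"}
G_ADJ = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2}, 4: {2}}
G_LABELS = {1: "A", 2: "B", 3: "C", 4: "A"}

if __name__ == "__main__":
    print(candidates(Q_ADJ, Q_LABELS, G_ADJ, G_LABELS))
    print(match(Q_ADJ, Q_LABELS, G_ADJ, G_LABELS))
```

In the example, data node 4 carries the right label for pattern node `a` but has no neighbor labeled C, so the profile check discards it before any backtracking takes place. The filter is safe under the assumed definition because an injective, label-preserving mapping sends each pattern node's neighborhood into the neighborhood of its image.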

Published In

SIGMOD '08: Proceedings of the 2008 ACM SIGMOD international conference on Management of data
June 2008
1396 pages
ISBN:9781605581026
DOI:10.1145/1376616
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. graph algebra
  2. graph query language
  3. query optimization

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '08

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Article Metrics

  • Downloads (last 12 months): 141
  • Downloads (last 6 weeks): 10

Reflects downloads up to 28 Feb 2025.

Cited By

  • (2025) GLumin: Fast Connectivity Check Based on LUTs For Efficient Graph Pattern Mining. Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, pp. 455-468. DOI: 10.1145/3710848.3710889. Online publication date: 28-Feb-2025
  • (2025) BⓈX: Subgraph Matching with Batch Backtracking Search. Proceedings of the ACM on Management of Data, 3:1, pp. 1-27. DOI: 10.1145/3709665. Online publication date: 11-Feb-2025
  • (2025) Steering veridical large language model analyses by correcting and enriching generated database queries: first steps toward ChatGPT bioinformatics. Briefings in Bioinformatics, 26:1. DOI: 10.1093/bib/bbaf045. Online publication date: 6-Feb-2025
  • (2025) ASM: Adaptive Subgraph Matching via Efficient Compression and Label Filter. Web and Big Data. APWeb-WAIM 2024 International Workshops, pp. 30-42. DOI: 10.1007/978-981-96-0055-7_3. Online publication date: 31-Jan-2025
  • (2024) Improving the Relevance, Speed, and Computational Efficiency of Semantic Search through Database Indexing: A Review. Optimization Algorithms - Classics and Recent Advances. DOI: 10.5772/intechopen.112232. Online publication date: 10-Jul-2024
  • (2024) Categorical Multi-Query Subgraph Matching on Labeled Graph. Electronics, 13:21, 4191. DOI: 10.3390/electronics13214191. Online publication date: 25-Oct-2024
  • (2024) Top-k Graph Similarity Search Algorithm Based on Chi-Square Statistics in Probabilistic Graphs. Electronics, 13:1, 192. DOI: 10.3390/electronics13010192. Online publication date: 1-Jan-2024
  • (2024) Cardinality Estimation of Subgraph Matching: A Filtering-Sampling Approach. Proceedings of the VLDB Endowment, 17:7, pp. 1697-1709. DOI: 10.14778/3654621.3654635. Online publication date: 1-Mar-2024
  • (2024) Understanding High-Performance Subgraph Pattern Matching: A Systems Perspective. Proceedings of the 7th Joint Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA), pp. 1-12. DOI: 10.1145/3661304.3661897. Online publication date: 14-Jun-2024
  • (2024) A Comprehensive Survey and Experimental Study of Subgraph Matching: Trends, Unbiasedness, and Interaction. Proceedings of the ACM on Management of Data, 2:1, pp. 1-29. DOI: 10.1145/3639315. Online publication date: 26-Mar-2024
