Query Language and Access Methods for Graph Databases

Managing and Mining Graph Data

Part of the book series: Advances in Database Systems ((ADBS,volume 40))

Abstract

With the prevalence of graph data in a variety of domains, there is an increasing need for a language to query and manipulate graphs with heterogeneous attributes and structures. We present a graph query language (GraphQL) that supports bulk operations on graphs with arbitrary structures and annotated attributes. In this language, graphs are the basic unit of information and each query manipulates one or more collections of graphs at a time. The core of GraphQL is a graph algebra extended from the relational algebra in which the selection operator is generalized to graph pattern matching and a composition operator is introduced for rewriting matched graphs. Then, we investigate access methods of the selection operator. Pattern matching over large graphs is challenging due to the NP-completeness of subgraph isomorphism. We address this by a combination of techniques: use of neighborhood subgraphs and profiles, joint reduction of the search space, and optimization of the search order. Experimental results on real and synthetic large graphs demonstrate that graph-specific optimizations outperform an SQL-based implementation by orders of magnitude.
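As a rough illustration of the pruning idea named in the abstract, the following Python sketch filters candidate nodes by a radius-1 label profile before running a backtracking search. It is a minimal sketch under stated assumptions, not the chapter's implementation: the adjacency-dict representation, the function names (`profile`, `candidates`, `match`), the choice of radius 1, and the "smallest candidate set first" ordering are all hypothetical stand-ins for the profiles and search-order optimization the abstract mentions. Matching here is non-induced: every pattern edge must exist in the data graph, but extra data edges are allowed.

```python
from collections import Counter


def profile(adj, labels, v):
    """Radius-1 profile (assumed form): multiset of labels of v and its neighbors."""
    return Counter([labels[v]] + [labels[u] for u in adj[v]])


def candidates(p_adj, p_lab, g_adj, g_lab):
    """For each pattern node, keep data nodes with the same label whose
    profile contains the pattern node's profile (a necessary condition)."""
    cands = {}
    for u in p_adj:
        pu = profile(p_adj, p_lab, u)
        cands[u] = [
            v for v in g_adj
            if g_lab[v] == p_lab[u]
            # Counter subtraction keeps only positive counts, so an empty
            # result means the data profile covers the pattern profile.
            and not (pu - profile(g_adj, g_lab, v))
        ]
    return cands


def match(p_adj, p_lab, g_adj, g_lab):
    """Enumerate all injective mappings of pattern nodes to data nodes
    that preserve labels and pattern edges."""
    cands = candidates(p_adj, p_lab, g_adj, g_lab)
    # Simple heuristic order (an assumption): most selective node first.
    order = sorted(p_adj, key=lambda u: len(cands[u]))
    results = []

    def extend(i, mapping):
        if i == len(order):
            results.append(dict(mapping))
            return
        u = order[i]
        for v in cands[u]:
            if v in mapping.values():
                continue  # keep the mapping injective
            # Every already-mapped pattern neighbor must be adjacent to v.
            if all(mapping[w] in g_adj[v] for w in p_adj[u] if w in mapping):
                mapping[u] = v
                extend(i + 1, mapping)
                del mapping[u]

    extend(0, {})
    return results


# Toy data: a labeled triangle 1-2-3 plus a pendant node 4 attached to 3.
g_adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
g_lab = {1: "A", 2: "B", 3: "C", 4: "A"}
# Pattern: a triangle with labels A, B, C.
p_adj = {"x": {"y", "z"}, "y": {"x", "z"}, "z": {"x", "y"}}
p_lab = {"x": "A", "y": "B", "z": "C"}

print(match(p_adj, p_lab, g_adj, g_lab))
```

In this toy input, node 4 carries label A but has no B-labeled neighbor, so the profile test discards it as a candidate for `x` before any search takes place; the backtracking step then only has to confirm the single remaining assignment.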

This is a revised and extended version of the article “Graphs-at-a-time: Query Language and Access Methods for Graph Databases”, Huahai He and Ambuj K. Singh, in Proceedings of the 2008 ACM SIGMOD Conference, http://doi.acm.org/10.1145/1376616.1376660. Reprinted with permission of ACM.

Work done while at the University of California, Santa Barbara.



Author information

Corresponding author

Correspondence to Huahai He.

Copyright information

© 2010 Springer-Verlag US

About this chapter

Cite this chapter

He, H., Singh, A.K. (2010). Query Language and Access Methods for Graph Databases. In: Aggarwal, C., Wang, H. (eds) Managing and Mining Graph Data. Advances in Database Systems, vol 40. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-6045-0_4

  • DOI: https://doi.org/10.1007/978-1-4419-6045-0_4

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4419-6044-3

  • Online ISBN: 978-1-4419-6045-0

  • eBook Packages: Computer Science, Computer Science (R0)
