Labeling RDF Graphs for Linear Time and Space Querying

Semantic Web Information Management

Abstract

Indices and data structures for web querying have mostly considered tree-shaped data, reflecting the view of XML documents as tree-shaped. However, for RDF (and when querying ID/IDREF constraints in XML), data is indisputably graph-shaped. In this chapter, we first study existing indexing and labeling schemes for RDF and other graph data, with a focus on support for efficient adjacency and reachability queries. For XML, labeling schemes are an important part of the widespread adoption of XML, in particular for mapping XML to existing (relational) database technology. However, the existing indexing and labeling schemes for RDF (and graph data in general) sacrifice one of the most attractive properties of XML labeling schemes: the constant time (and per-node space) test for adjacency (child) and reachability (descendant). In the second part, we introduce the first labeling scheme for RDF data that retains this property and thus achieves linear time and space processing of acyclic RDF queries on a significantly larger class of graphs than previous approaches (which are mostly limited to tree-shaped data). Finally, we show how this labeling scheme can be applied to (acyclic) SPARQL queries to obtain an evaluation algorithm with time and space complexity linear in the number of resources in the queried RDF graph.
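To make the "constant time (and per-node space) test" concrete, the following is a minimal sketch of the classic start/end interval labeling for tree-shaped data, the kind of XML labeling scheme the abstract refers to. It is not the chapter's labeling scheme for RDF graphs, which is not described on this page. The function names (`label_tree`, `is_descendant`, `is_child`) and the dictionary-based tree representation are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Label = Tuple[int, int, int]  # (start, end, depth); hypothetical label layout


def label_tree(children: Dict[str, List[str]], root: str) -> Dict[str, Label]:
    """Assign a (start, end, depth) label to every node in one depth-first pass.

    Each node stores three integers, so per-node space is constant.
    """
    labels: Dict[str, Label] = {}
    start: Dict[str, int] = {}
    counter = 0
    # Iterative DFS; the boolean marks whether the node's subtree is finished.
    stack = [(root, 0, False)]
    while stack:
        node, depth, done = stack.pop()
        if done:
            labels[node] = (start[node], counter, depth)
            counter += 1
        else:
            start[node] = counter
            counter += 1
            stack.append((node, depth, True))
            for child in reversed(children.get(node, [])):
                stack.append((child, depth + 1, False))
    return labels


def is_descendant(labels: Dict[str, Label], anc: str, desc: str) -> bool:
    """Reachability test: desc's interval lies strictly inside anc's interval."""
    a_start, a_end, _ = labels[anc]
    d_start, d_end, _ = labels[desc]
    return a_start < d_start and d_end < a_end


def is_child(labels: Dict[str, Label], parent: str, child: str) -> bool:
    """Adjacency test: a descendant exactly one level deeper."""
    return (is_descendant(labels, parent, child)
            and labels[child][2] == labels[parent][2] + 1)


# Example tree: a has children b and c; b has child d.
tree = {"a": ["b", "c"], "b": ["d"]}
labels = label_tree(tree, "a")
print(is_descendant(labels, "a", "d"))  # True
print(is_child(labels, "a", "d"))       # False
```

Both tests compare a fixed number of integers, independent of the size of the tree. On general graphs a node may have several parents, so a single nested interval per node no longer suffices; this is the gap the abstract says existing graph labeling schemes leave open.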

Author information

Corresponding author

Correspondence to Tim Furche.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Furche, T., Weinzierl, A., Bry, F. (2010). Labeling RDF Graphs for Linear Time and Space Querying. In: de Virgilio, R., Giunchiglia, F., Tanca, L. (eds) Semantic Web Information Management. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04329-1_14

  • DOI: https://doi.org/10.1007/978-3-642-04329-1_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04328-4

  • Online ISBN: 978-3-642-04329-1

  • eBook Packages: Computer Science, Computer Science (R0)
