Abstract
Indices and data structures for web querying have mostly considered tree-shaped data, reflecting the view of XML documents as trees. However, RDF data (and XML data, when ID/IDREF constraints are queried) is indisputably graph-shaped. In this chapter, we first study existing indexing and labeling schemes for RDF and other graph data, with a focus on support for efficient adjacency and reachability queries. For XML, labeling schemes have been an important factor in its widespread adoption, in particular for mapping XML to existing (relational) database technology. However, the existing indexing and labeling schemes for RDF (and graph data in general) sacrifice one of the most attractive properties of XML labeling schemes: the constant-time (and constant per-node space) test for adjacency (child) and reachability (descendant). In the second part, we introduce the first labeling scheme for RDF data that retains this property and thus achieves linear time and space processing of acyclic RDF queries on a significantly larger class of graphs than previous approaches (which are mostly limited to tree-shaped data). Finally, we show how this labeling scheme can be applied to (acyclic) SPARQL queries to obtain an evaluation algorithm with time and space complexity linear in the number of resources in the queried RDF graph.
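To illustrate the property the abstract refers to, the following is a minimal sketch of the classical pre/post-order labeling for tree-shaped data, where each node stores a constant-size label and the child and descendant tests are constant-time comparisons. It is not the labeling scheme introduced in the chapter, which targets a larger class of graphs. The function names and the dictionary-based tree representation are assumptions made for this example.

```python
from typing import Dict, Hashable, List, Tuple

# A label is (pre-order rank, post-order rank, depth): constant space per node.
Label = Tuple[int, int, int]


def label_tree(children: Dict[Hashable, List[Hashable]],
               root: Hashable) -> Dict[Hashable, Label]:
    """Assign pre/post/depth labels with one iterative depth-first traversal.

    `children` maps a node to its ordered child list (hypothetical input
    format chosen for this sketch); leaves may be absent from the mapping.
    """
    labels: Dict[Hashable, Label] = {}
    pre: Dict[Hashable, int] = {}
    pre_counter = 0
    post_counter = 0
    # Each stack entry is (node, depth, finished?).
    stack = [(root, 0, False)]
    while stack:
        node, depth, finished = stack.pop()
        if finished:
            # All descendants have been visited: assign the post-order rank.
            labels[node] = (pre[node], post_counter, depth)
            post_counter += 1
            continue
        pre[node] = pre_counter
        pre_counter += 1
        stack.append((node, depth, True))
        # Push children in reverse so they are visited in document order.
        for child in reversed(children.get(node, [])):
            stack.append((child, depth + 1, False))
    return labels


def is_descendant(labels: Dict[Hashable, Label], anc: Hashable, desc: Hashable) -> bool:
    """Reachability test: two integer comparisons, independent of tree size."""
    pre_a, post_a, _ = labels[anc]
    pre_d, post_d, _ = labels[desc]
    return pre_a < pre_d and post_d < post_a


def is_child(labels: Dict[Hashable, Label], parent: Hashable, child: Hashable) -> bool:
    """Adjacency test: a descendant exactly one level below the parent."""
    return (is_descendant(labels, parent, child)
            and labels[child][2] == labels[parent][2] + 1)
```

Labeling takes one pass over the tree, so time and space are linear in the number of nodes. The difficulty the abstract points to is that this simple containment test no longer suffices once a node can have several parents, as in RDF graphs.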
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Furche, T., Weinzierl, A., Bry, F. (2010). Labeling RDF Graphs for Linear Time and Space Querying. In: de Virgilio, R., Giunchiglia, F., Tanca, L. (eds) Semantic Web Information Management. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04329-1_14
DOI: https://doi.org/10.1007/978-3-642-04329-1_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04328-4
Online ISBN: 978-3-642-04329-1
eBook Packages: Computer Science (R0)