DOI: 10.1145/2213556.2213572
research-article

Static analysis and optimization of semantic web queries

Published: 21 May 2012

ABSTRACT

Static analysis is a fundamental task in query optimization. In this paper we study static analysis and optimization techniques for SPARQL, which is the standard language for querying Semantic Web data. Of particular interest for us is the optionality feature in SPARQL. It is crucial in Semantic Web data management, where data sources are inherently incomplete and the user is usually interested in partial answers to queries. This feature is one of the most complicated constructors in SPARQL and also the one that makes this language depart from classical query languages such as relational conjunctive queries. We focus on the class of well-designed SPARQL queries, which has been proposed in the literature as a fragment of the language with good properties regarding query evaluation. We first propose a tree representation for SPARQL queries, called pattern trees, which captures the class of well-designed SPARQL graph patterns and which can be considered as a query execution plan. Among other results, we propose several transformation rules for pattern trees, a simple normal form, and study equivalence and containment. We also study the enumeration and counting problems for this class of queries.
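To make the optionality feature concrete, here is a minimal Python sketch of the usual set-based reading of OPTIONAL as a left outer join over solution mappings: each mapping from the mandatory part is extended with compatible mappings from the optional part when any exist, and kept unchanged otherwise. It is not taken from the paper. The function names (`compatible`, `join`, `left_outer_join`) and the sample data are hypothetical and chosen only for illustration.

```python
def compatible(m1, m2):
    """Two solution mappings are compatible if they agree on every shared variable."""
    return all(m2[var] == val for var, val in m1.items() if var in m2)


def join(omega1, omega2):
    """Plain join (AND): keep only merged pairs of compatible mappings."""
    return [{**m1, **m2} for m1 in omega1 for m2 in omega2 if compatible(m1, m2)]


def left_outer_join(omega1, omega2):
    """OPTIONAL: extend a left mapping when possible, otherwise keep it as a partial answer."""
    result = []
    for m1 in omega1:
        extensions = [{**m1, **m2} for m2 in omega2 if compatible(m1, m2)]
        result.extend(extensions if extensions else [m1])
    return result


# Hypothetical solution mappings for two triple patterns:
#   (?p, name, ?name)  and  (?p, email, ?email)
names = [{"?p": "alice", "?name": "Alice"}, {"?p": "bob", "?name": "Bob"}]
emails = [{"?p": "alice", "?email": "alice@example.org"}]

strict_answers = join(names, emails)               # drops bob: no email is known
partial_answers = left_outer_join(names, emails)   # keeps bob with ?email unbound
```

With incomplete data, the plain join discards `bob` entirely, while the optional version returns him with `?email` left unbound. This is the kind of partial answer the abstract refers to.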

Published in

PODS '12: Proceedings of the 31st ACM SIGMOD-SIGACT-SIGAI symposium on Principles of Database Systems
May 2012
332 pages
ISBN: 9781450312486
DOI: 10.1145/2213556

      Copyright © 2012 ACM

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

PODS '12 Paper Acceptance Rate: 26 of 101 submissions, 26%
Overall Acceptance Rate: 642 of 2,707 submissions, 24%
