DOI: 10.1145/2213556.2213573
research-article

The complexity of evaluating path expressions in SPARQL

Published: 21 May 2012

ABSTRACT

The World Wide Web Consortium (W3C) recently introduced property paths in SPARQL 1.1, a query language for RDF data. Property paths allow SPARQL queries to evaluate regular expressions over graph data. However, they differ from standard regular expressions in several notable aspects. For example, they have a limited form of negation, they have numerical occurrence indicators as syntactic sugar, and their semantics on graphs is defined in a non-standard manner. We formalize the W3C semantics of property paths and investigate various query evaluation problems on graphs. More specifically, let x and y be two nodes in an edge-labeled graph and r be an expression. We study the complexities of (1) deciding whether there exists a path from x to y that matches r and (2) counting how many paths from x to y match r. Our main results show that, compared to an alternative semantics of regular expressions on graphs, the complexity of (1) and (2) under W3C semantics is significantly higher. Whereas the alternative semantics remains in polynomial time for large fragments of expressions, the W3C semantics makes problems (1) and (2) intractable almost immediately.
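To make problem (1) concrete, the following is a minimal sketch of the textbook product construction for deciding whether some path from x to y matches r. It assumes that the alternative semantics allows arbitrary paths, in which nodes and edges may repeat; the abstract does not spell this out. It also assumes that r has already been compiled into an NFA without epsilon transitions. The function name, the triple-based graph encoding, and the dictionary-based NFA encoding are all hypothetical choices made for illustration, not taken from the paper.

```python
from collections import deque


def exists_matching_path(edges, nfa_delta, nfa_start, nfa_accepting, x, y):
    """Return True if some path from x to y spells a word the NFA accepts.

    edges: iterable of (source, label, target) triples (edge-labeled graph).
    nfa_delta: dict mapping (state, label) -> set of successor states.
    Assumption: paths may revisit nodes and edges (arbitrary-path semantics).
    """
    # Index outgoing edges by source node.
    out = {}
    for src, label, dst in edges:
        out.setdefault(src, []).append((label, dst))

    # Breadth-first search over (graph node, NFA state) pairs.
    start = (x, nfa_start)
    seen = {start}
    queue = deque([start])
    while queue:
        node, state = queue.popleft()
        if node == y and state in nfa_accepting:
            return True
        for label, dst in out.get(node, []):
            for nxt in nfa_delta.get((state, label), ()):
                pair = (dst, nxt)
                if pair not in seen:
                    seen.add(pair)
                    queue.append(pair)
    return False


# Example graph: 1 -a-> 2 -b-> 3 -a-> 1
EDGES = [(1, "a", 2), (2, "b", 3), (3, "a", 1)]
# NFA for (ab)*: state 0 is both start and accepting.
AB_STAR = {(0, "a"): {1}, (1, "b"): {0}}

print(exists_matching_path(EDGES, AB_STAR, 0, {0}, 1, 3))
print(exists_matching_path(EDGES, AB_STAR, 0, {0}, 1, 2))
```

The search visits each (node, state) pair at most once, so its running time is polynomial in the size of the graph and the automaton. This sketch does not implement the W3C semantics, which the abstract reports to be significantly harder.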

As a side-result, we prove that the membership problem for regular expressions with numerical occurrence indicators and negation is in polynomial time.
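One way to see why such a membership test can run in polynomial time is a dynamic program over substring positions, sketched below. This is an illustrative approach and not necessarily the authors' proof. It assumes that negation is full language complement, which may be more general than the limited negation the abstract mentions. The tuple encoding of expressions and the names `compose`, `power`, and `matches` are hypothetical. Counters are handled by repeated squaring, so large bounds written in binary cost only logarithmically many compositions.

```python
def compose(r, s):
    """Relational composition: pairs (i, j) with (i, m) in r and (m, j) in s."""
    by_start = {}
    for m, j in s:
        by_start.setdefault(m, set()).add(j)
    return {(i, j) for i, m in r for j in by_start.get(m, ())}


def power(r, k, identity):
    """Compose r with itself k times using repeated squaring."""
    result, base = identity, r
    while k > 0:
        if k & 1:
            result = compose(result, base)
        base = compose(base, base)
        k >>= 1
    return result


def matches(expr, word):
    """Decide whether word is in the language of expr.

    Expressions are tuples (hypothetical encoding):
      ("eps",), ("sym", a), ("alt", e1, e2), ("cat", e1, e2),
      ("not", e), ("count", e, lo, hi) where hi=None means unbounded.
    """
    n = len(word)
    identity = {(i, i) for i in range(n + 1)}
    all_pairs = {(i, j) for i in range(n + 1) for j in range(i, n + 1)}

    def rel(e):
        # rel(e) holds every (i, j) such that word[i:j] matches e.
        kind = e[0]
        if kind == "eps":
            return set(identity)
        if kind == "sym":
            return {(i, i + 1) for i in range(n) if word[i] == e[1]}
        if kind == "alt":
            return rel(e[1]) | rel(e[2])
        if kind == "cat":
            return compose(rel(e[1]), rel(e[2]))
        if kind == "not":
            # Assumption: negation is complement with respect to all words.
            return all_pairs - rel(e[1])
        if kind == "count":
            _, sub, lo, hi = e
            r = rel(sub)
            # At most n non-empty repetitions fit into a word of length n.
            extra_steps = n if hi is None else hi - lo
            optional = power(r | identity, extra_steps, identity)
            return compose(power(r, lo, identity), optional)
        raise ValueError(f"unknown expression kind: {kind!r}")

    return (0, n) in rel(expr)


A_2_TO_3 = ("count", ("sym", "a"), 2, 3)  # a{2,3}
print(matches(A_2_TO_3, "aaa"))
print(matches(("not", A_2_TO_3), "a"))
```

Each subexpression is evaluated once, and every relation has at most quadratically many pairs in the length of the word, which keeps the whole computation polynomial under the stated assumptions.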

Published in

PODS '12: Proceedings of the 31st ACM SIGMOD-SIGACT-SIGAI symposium on Principles of Database Systems
May 2012
332 pages
ISBN: 9781450312486
DOI: 10.1145/2213556

Copyright © 2012 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

PODS '12 paper acceptance rate: 26 of 101 submissions, 26%. Overall acceptance rate: 642 of 2,707 submissions, 24%.
