ABSTRACT
The problem of answering SPARQL queries over virtual SPARQL views arises in a number of settings, for example when enforcing security policies on access to RDF data or when integrating RDF data from disparate sources. We approach this problem by rewriting SPARQL queries over the views into equivalent queries over the underlying RDF data, thus avoiding the costs of view materialization and maintenance. We show that SPARQL query rewriting combines the most challenging aspects of rewriting in the relational and XML cases: as in the relational case, SPARQL query rewriting requires synthesizing multiple views; as in the XML case, the size of the rewritten query is exponential in the size of the query and the views. In this paper, we present the first native query rewriting algorithm for SPARQL. For an input SPARQL query over a set of virtual SPARQL views, the rewritten query resembles a union of conjunctive queries and can be of exponential size. We propose optimizations over the basic rewriting algorithm to (i) minimize each conjunctive query in the union; (ii) eliminate conjunctive queries with empty results from evaluation; and (iii) efficiently prune large portions of the search space of empty rewritings. Experiments on two RDF stores show that our algorithms are scalable and independent of the underlying RDF store. Furthermore, our optimizations yield order-of-magnitude improvements over the basic rewriting algorithm in both rewriting size and evaluation time.
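To make the exponential size of the rewriting concrete, the following is a minimal sketch, not the algorithm of the paper. It assumes a toy encoding in which a query is a list of triple patterns and each view exposes a list of template triples; the names `compatible`, `candidate_rewritings`, `views`, and `query` are hypothetical. Each query pattern is matched against every view template by a position-wise check only (consistency of variable bindings across patterns is deliberately ignored), and one candidate conjunctive query is formed per combination of choices.

```python
from itertools import product

def is_var(term):
    # In this toy encoding, variables are strings starting with "?".
    return term.startswith("?")

def compatible(query_tp, template_tp):
    # Position-wise check: two constants must be equal; a variable on
    # either side is compatible with anything. This is a simplification
    # that does not track variable bindings.
    return all(is_var(q) or is_var(t) or q == t
               for q, t in zip(query_tp, template_tp))

def candidate_rewritings(query, views):
    # For each query triple pattern, collect every (view, template) pair
    # it is compatible with, then take the cross product of the choices.
    per_pattern = [
        [(name, tmpl)
         for name, templates in views.items()
         for tmpl in templates
         if compatible(tp, tmpl)]
        for tp in query
    ]
    return list(product(*per_pattern))

# Hypothetical views, each listing its template triples.
views = {
    "v1": [("?x", "knows", "?y")],
    "v2": [("?a", "knows", "?b"), ("?a", "worksFor", "?c")],
}
# Hypothetical query with two triple patterns.
query = [("?p", "knows", "?q"), ("?q", "knows", "?r")]

for combo in candidate_rewritings(query, views):
    print([name for name, _ in combo])
```

With two compatible templates per pattern, a query of n such patterns produces 2^n candidate conjunctive queries in this sketch, which is the kind of blow-up that motivates pruning empty rewritings early.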