ABSTRACT
Query execution over the Web of Linked Data has attracted much attention recently. A particularly interesting approach is link traversal based query execution, which proposes to integrate the traversal of data links into the creation of query results. Hence, in contrast to traditional query execution paradigms, this approach does not assume a fixed set of relevant data sources beforehand; instead, the traversal process discovers data and data sources on the fly and thus enables applications to tap the full potential of the Web.
While several authors have studied possibilities to implement the idea of link traversal based query execution and to optimize query execution in this context, no work exists that discusses theoretical foundations of the approach in general. Our paper fills this gap.
We introduce a well-defined semantics for queries that may be executed using a link traversal based approach. Based on this semantics we formally analyze properties of such queries. In particular, we study the computability of queries as well as the implications of querying a potentially infinite Web of Linked Data. Our results show that query computation in general is not guaranteed to terminate and that for any given query it is undecidable whether the execution terminates. Furthermore, we define an abstract execution model that captures the integration of link traversal into the query execution process. Based on this model we prove the soundness and completeness of link traversal based query execution and analyze an existing implementation approach.
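As an illustration of the general idea described above (not the formal execution model defined in the paper), the following minimal Python sketch interleaves link traversal with query evaluation over a toy in-memory Web. Everything in it is an assumption made for illustration: the `web` dictionary standing in for URI lookups, the triple-pattern representation with `?`-prefixed variables, the function names `match`, `evaluate`, and `execute`, and the `max_lookups` cap. The cap is a pragmatic way to force termination in the sketch, since traversal over a potentially infinite Web is not guaranteed to terminate.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]


def match(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):                 # variable term
            if result.setdefault(p, t) != t:  # conflicts with an earlier binding
                return None
        elif p != t:                          # constant term must be identical
            return None
    return result


def evaluate(patterns: List[Triple], data: Set[Triple]) -> List[List[Binding]]:
    """Nested-loop join; stage i holds the bindings matching patterns[0..i]."""
    stages: List[List[Binding]] = []
    frontier: List[Binding] = [{}]
    for pattern in patterns:
        extended_frontier: List[Binding] = []
        for binding in frontier:
            for triple in data:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    extended_frontier.append(extended)
        frontier = extended_frontier
        stages.append(frontier)
    return stages


def execute(patterns: List[Triple], seeds: Iterable[str],
            web: Dict[str, Set[Triple]], max_lookups: int = 100) -> List[Binding]:
    """Discover data by looking up URIs that occur in partial solutions."""
    data: Set[Triple] = set()
    seen: Set[str] = set()
    pending = list(seeds)
    while pending and len(seen) < max_lookups:
        uri = pending.pop()
        if uri in seen or uri not in web:     # already fetched, or not a known URI
            continue
        seen.add(uri)
        data |= web[uri]                      # simulated URI lookup
        for stage in evaluate(patterns, data):
            for binding in stage:
                pending.extend(v for v in binding.values() if v not in seen)
    stages = evaluate(patterns, data)
    return stages[-1] if stages else []


# Hypothetical toy Web: each URI maps to the triples returned when it is looked up.
web = {
    "ex:alice": {("ex:alice", "knows", "ex:bob")},
    "ex:bob": {("ex:bob", "name", "Bob")},
}
query = [("ex:alice", "knows", "?x"), ("?x", "name", "?n")]
print(execute(query, ["ex:alice"], web))  # [{'?x': 'ex:bob', '?n': 'Bob'}]
```

Only `ex:alice` is given as a seed; `ex:bob` is discovered because it is bound to `?x` in a partial solution. The sketch re-evaluates the query after every lookup, which is simple but inefficient, and it follows links only from solutions of pattern prefixes, so the result can depend on pattern order.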