Abstract
Assessing the quality of linked data currently published on the Web is a crucial need for various data-intensive applications. Extensive work on similar applications for relational data and queries has shown that data provenance can be used to compute the trustworthiness, reputation and reliability of query results, based on the source data and query operators involved in their derivation. In particular, abstract provenance models can be employed to record information about source data and query operators during query evaluation, and later be used, e.g., to assess trust for individual query results. In this paper, we investigate the extent to which relational provenance models can be leveraged for capturing the provenance of SPARQL queries over linked data, and identify their limitations. To overcome these limitations, we advocate the need for new provenance models that capture the full expressive power of SPARQL and can be used to support the assessment of various forms of data quality for linked data manipulated declaratively by such queries.
An earlier version of this paper appeared in IEEE Internet Computing 15(1): 31-39, 2011.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Karvounarakis, G., Fundulaki, I., Christophides, V. (2013). Provenance for Linked Data. In: Tannen, V., Wong, L., Libkin, L., Fan, W., Tan, WC., Fourman, M. (eds) In Search of Elegance in the Theory and Practice of Computation. Lecture Notes in Computer Science, vol 8000. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41660-6_19
DOI: https://doi.org/10.1007/978-3-642-41660-6_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41659-0
Online ISBN: 978-3-642-41660-6
eBook Packages: Computer Science, Computer Science (R0)