Abstract
The provenance of data has recently been recognized as central to the trust one places in data. This paper presents a novel framework to empower provenance in a mediator-based data integration system. We use a simple language for mapping schema constructs between an ontology and relational sources; the language is capable of carrying provenance information. It extends the traditional data exchange setting: our mapping specifications are translated into source-to-target tuple generating dependencies (s-t tgds). We then formally define the provenance information we want to retrieve, namely annotation, source, and tuple provenance. We provide three algorithms that retrieve this provenance information using information stored in the mappings and the sources. We show the feasibility of our solution and the advantages of our framework.
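For readers unfamiliar with the notation, the first formula below gives the general shape of an s-t tgd as it is usually written in the data exchange literature. The second formula is only an illustrative instance: the relation names Painting, Artwork, producedBy and carriedOutBy are hypothetical and are not taken from the paper's mapping language.

```latex
% General form: \varphi_S is a conjunction of atoms over the source schema S,
% \psi_T is a conjunction of atoms over the target schema T.
\[
  \forall \mathbf{x}\, \bigl( \varphi_S(\mathbf{x}) \rightarrow \exists \mathbf{y}\, \psi_T(\mathbf{x}, \mathbf{y}) \bigr)
\]

% Hypothetical instance: a relational source tuple Painting(p, a) is mapped to
% ontology-style target atoms, with an existentially quantified event e.
\[
  \forall p, a\, \bigl( \mathit{Painting}(p, a) \rightarrow
    \exists e\, ( \mathit{Artwork}(p) \land \mathit{producedBy}(p, e) \land \mathit{carriedOutBy}(e, a) ) \bigr)
\]
```

In the instance, the existential variable e stands for a target value that has no counterpart in the source, which is the typical reason the right-hand side of a tgd needs existential quantification.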
This work was partially supported by the EU project plutIt (ICT-231430).
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kondylakis, H., Doerr, M., Plexousakis, D. (2009). Empowering Provenance in Data Integration. In: Grundspenkis, J., Morzy, T., Vossen, G. (eds) Advances in Databases and Information Systems. ADBIS 2009. Lecture Notes in Computer Science, vol 5739. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03973-7_20
DOI: https://doi.org/10.1007/978-3-642-03973-7_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03972-0
Online ISBN: 978-3-642-03973-7
eBook Packages: Computer Science, Computer Science (R0)