Empowering Provenance in Data Integration

  • Conference paper
Advances in Databases and Information Systems (ADBIS 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5739)

Abstract

The provenance of data has recently been recognized as central to the trust one places in data. This paper presents a novel framework to empower provenance in a mediator-based data integration system. We use a simple mapping language, capable of carrying provenance information, for mapping schema constructs between an ontology and relational sources. This language extends the traditional data exchange setting by translating our mapping specifications into source-to-target tuple generating dependencies (s-t tgds). We then formally define the provenance information we want to retrieve, i.e., annotation, source, and tuple provenance. We provide three algorithms to retrieve provenance information using information stored in the mappings and the sources. We show the feasibility of our solution and the advantages of our framework.
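
For readers unfamiliar with the term, a source-to-target tuple generating dependency in the standard data exchange setting has the general shape shown on the first line below, where \varphi_S is a conjunction of atoms over the source schema and \psi_T is a conjunction of atoms over the target schema. The second line is only an illustrative instance: the relation Painting and the ontology predicates Work, hasTitle, createdBy, and hasName are hypothetical names chosen for this sketch and are not taken from the paper's mapping language.

```latex
% General form of an s-t tgd (standard data exchange definition)
\forall \mathbf{x}\, \bigl( \varphi_S(\mathbf{x}) \rightarrow \exists \mathbf{y}\, \psi_T(\mathbf{x}, \mathbf{y}) \bigr)

% Hypothetical instance: a relational source table mapped to ontology predicates
\forall id, t, a\, \bigl( \mathrm{Painting}(id, t, a) \rightarrow
  \exists p\, ( \mathrm{Work}(id) \wedge \mathrm{hasTitle}(id, t)
  \wedge \mathrm{createdBy}(id, p) \wedge \mathrm{hasName}(p, a) ) \bigr)
```

In this sketch the existential variable p stands for a creator individual that the target must contain even though the source stores only the creator's name.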

This work was partially supported by the EU project plutIt (ICT-231430).

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kondylakis, H., Doerr, M., Plexousakis, D. (2009). Empowering Provenance in Data Integration. In: Grundspenkis, J., Morzy, T., Vossen, G. (eds) Advances in Databases and Information Systems. ADBIS 2009. Lecture Notes in Computer Science, vol 5739. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03973-7_20

  • DOI: https://doi.org/10.1007/978-3-642-03973-7_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03972-0

  • Online ISBN: 978-3-642-03973-7

  • eBook Packages: Computer Science, Computer Science (R0)
