Data Lineage Tracing in Data Warehousing Environments

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4587)

Abstract

Data lineage tracing (DLT) is the problem of finding the derivations of integrated data in integrated database systems, where the data sources may be autonomous, distributed and heterogeneous. In previous work, we presented a DLT approach using partial schema transformation pathways. In this paper, we extend that approach to use full schema transformation pathways and discuss the problem of lineage data ambiguities. Our DLT approach is not limited to one specific data model or query language, and should be useful in general data warehousing environments.

Editor information

Richard Cooper, Jessie Kennedy

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fan, H. (2007). Data Lineage Tracing in Data Warehousing Environments. In: Cooper, R., Kennedy, J. (eds) Data Management. Data, Data Everywhere. BNCOD 2007. Lecture Notes in Computer Science, vol 4587. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73390-4_4

  • DOI: https://doi.org/10.1007/978-3-540-73390-4_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73389-8

  • Online ISBN: 978-3-540-73390-4

  • eBook Packages: Computer Science, Computer Science (R0)
