The Providence of Provenance

  • Conference paper
Big Data (BNCOD 2013)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 7968))

Abstract

For many years and under various names, provenance has been modelled, theorised about, standardised and implemented in various ways; it has become part of mainstream database research. Moreover, the topic has now infected nearly every branch of computer science: provenance is a problem for everyone. But what exactly is the problem? And has the copious research had any real effect on how we use databases or, more generally, how we use computers?

This is a brief attempt to summarise the research on provenance and what practical impact it has had. Although much of the research has yet to come to market, there is an increasing interest in the topic from industry; moreover, it has had a surprising impact in tangential areas such as data integration and data citation. However, we are still lacking basic tools to deal with provenance and we need a culture shift if ever we are to make full use of the technology that has already been developed.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Buneman, P. (2013). The Providence of Provenance. In: Gottlob, G., Grasso, G., Olteanu, D., Schallhart, C. (eds) Big Data. BNCOD 2013. Lecture Notes in Computer Science, vol 7968. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39467-6_3

  • DOI: https://doi.org/10.1007/978-3-642-39467-6_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39466-9

  • Online ISBN: 978-3-642-39467-6

  • eBook Packages: Computer Science (R0)