Fine-Grained Provenance Collection over Scripts Through Program Slicing

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9672)

Abstract

Collecting provenance from scripts helps scientists explain and reproduce their experiments. However, most existing automatic approaches capture provenance at a coarse grain, for example, as a trace of user-defined function calls. These approaches lack information about variable dependencies. Without this information, users may struggle to identify which functions actually influenced the results, which leads to false-positive provenance links. To address this problem, we propose an approach that uses dynamic program slicing to gather provenance from Python scripts. By capturing dependencies among variables, it becomes possible to expose execution paths inside functions and, consequently, to build a provenance graph that accurately represents the function activations and the results they affect.
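
The difference between a coarse-grained trace and a dependency-based slice can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation: it assumes the execution has already been recorded as an ordered list of assignments, and the names `Event`, `backward_slice`, and the function labels (`load`, `normalize`, `summarize`, `simulate`) are hypothetical.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    """One executed assignment (hypothetical trace record)."""
    target: str         # variable that was written
    uses: frozenset     # variables that were read to compute it
    activation: str     # label of the function activation that ran


def backward_slice(trace: List[Event], criterion: str) -> List[Event]:
    """Return the events that the final value of `criterion` depends on."""
    relevant = {criterion}
    kept = []
    # Walk the trace from the last executed event to the first.
    for event in reversed(trace):
        if event.target in relevant:
            # This is the definition that reaches the later use, so earlier
            # definitions of the same variable no longer matter...
            relevant.discard(event.target)
            # ...but everything it read becomes relevant.
            relevant.update(event.uses)
            kept.append(event)
    kept.reverse()
    return kept


# Illustrative trace: four activations ran, but only three feed `result`.
trace = [
    Event("raw", frozenset(), "load()"),
    Event("clean", frozenset({"raw"}), "normalize(raw)"),
    Event("log", frozenset({"raw"}), "summarize(raw)"),
    Event("result", frozenset({"clean"}), "simulate(clean)"),
]

activations = [event.activation for event in backward_slice(trace, "result")]
print(activations)  # ['load()', 'normalize(raw)', 'simulate(clean)']
```

A coarse-grained trace would link all four activations to `result`. The slice omits `summarize(raw)` because no chain of variable dependencies connects it to `result`, which is the kind of false-positive link the abstract describes.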

Author information

Corresponding author

Correspondence to João Felipe Pimentel.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Pimentel, J.F., Freire, J., Murta, L., Braganholo, V. (2016). Fine-Grained Provenance Collection over Scripts Through Program Slicing. In: Mattoso, M., Glavic, B. (eds) Provenance and Annotation of Data and Processes. IPAW 2016. Lecture Notes in Computer Science, vol 9672. Springer, Cham. https://doi.org/10.1007/978-3-319-40593-3_21

  • DOI: https://doi.org/10.1007/978-3-319-40593-3_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-40592-6

  • Online ISBN: 978-3-319-40593-3

  • eBook Packages: Computer Science, Computer Science (R0)
