ABSTRACT
The massive trend of integrating data-driven AI capabilities into traditional software systems is raising intriguing new challenges. One such challenge is achieving a smooth transition from the exploratory phase of machine learning projects, in which data scientists build prototype models in the lab, to the production phase, in which software engineers translate those prototypes into production-ready AI components. To narrow the gap between these two phases, the tools and practices adopted by data scientists can be improved by incorporating consolidated software engineering solutions. In particular, computational notebooks play a prominent role in determining the quality of data science prototypes. In my research project, I address this challenge by studying best practices for collaboration with computational notebooks and by proposing proof-of-concept tools that foster compliance with such guidelines.
Assessing the quality of computational notebooks for a frictionless transition from exploration to production