DOI: 10.1145/3626246.3654752
short-paper
Open access

Demonstration of ElasticNotebook: Migrating Live Computational Notebook States

Published: 09 June 2024

Abstract

Computational notebooks (e.g., Jupyter, Google Colab) are widely used for interactive data science and machine learning. However, existing notebook systems cannot reliably and efficiently persist the notebook session state, which consists of user-defined variables (e.g., processed datasets, ML models). As a result, terminating a session often leads to loss of work.
In this demo, we introduce a new notebook system, ElasticNotebook, that offers live migration of session states via computational checkpointing/restoration for notebook systems (e.g., Jupyter Notebook, Colab). ElasticNotebook's frontend allows users to configure the periodic creation of session state checkpoints, which can then be restored at will through a drop-down menu. ElasticNotebook's backend uses novel lightweight monitoring techniques to find a reliable and efficient way (i.e., a replication plan) to replicate session states when requested. This demo will showcase ElasticNotebook's ability to preserve the user's work progress in Jupyter Servers by replicating their session state in two common use cases: live migration across machines and resumption after termination.
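
To make the idea of checkpointing and restoring a session state concrete, the following is a minimal sketch that assumes the session state is a plain dictionary of user-defined variables and uses only Python's standard pickle module. It is not ElasticNotebook's replication plan or API. The names checkpoint, restore, and session are hypothetical and chosen for illustration only.

```python
import pickle


def checkpoint(namespace):
    """Serialize each variable on its own; return (blob, names that failed)."""
    saved, skipped = {}, []
    for name, value in namespace.items():
        try:
            saved[name] = pickle.dumps(value)
        except Exception:
            # Some objects (e.g., generators) cannot be pickled at all.
            skipped.append(name)
    return pickle.dumps(saved), skipped


def restore(blob):
    """Rebuild a variable dictionary from a blob produced by checkpoint()."""
    return {name: pickle.loads(data) for name, data in pickle.loads(blob).items()}


# Hypothetical session state: a dataset, a model-like object, and a generator.
session = {
    "rows": [1, 2, 3],
    "model": {"weights": [0.5, 0.25]},
    "gen": (x for x in range(3)),
}

blob, skipped = checkpoint(session)
restored = restore(blob)
print(sorted(restored), skipped)
```

This naive approach shows why session persistence is hard: variables that cannot be serialized are silently dropped from the checkpoint, and pickling each variable separately loses any references shared between variables.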


Published In

SIGMOD/PODS '24: Companion of the 2024 International Conference on Management of Data
June 2024
694 pages
ISBN: 9798400704222
DOI: 10.1145/3626246
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. computing platforms
  2. data replication tools

Qualifiers

  • Short-paper

Conference

SIGMOD/PODS '24

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Article Metrics

  • Total Citations: 0
  • Total Downloads: 162
  • Downloads (Last 12 months): 162
  • Downloads (Last 6 weeks): 40

Reflects downloads up to 01 Mar 2025
