Deep learning has improved the state-of-the-art results in an ever-growing number of domains. This success relies heavily on the development and training of deep learning models, an experimental, iterative process that produces tens to hundreds of models before arriving at a satisfactory result. While there has been a surge in the number of tools and frameworks that aim to facilitate deep learning, managing the models and their artifacts remains surprisingly challenging and time-consuming. Existing model-management solutions are either tailored for commercial platforms or require significant code changes. Moreover, most existing solutions address a single phase of the modeling lifecycle, such as experiment monitoring, while ignoring other essential tasks, such as model deployment. In this paper, we present ModelKB, a software system that facilitates and accelerates the deep learning lifecycle. ModelKB can automatically manage the modeling lifecycle end-to-end, including (1) monitoring and tracking experiments; (2) visualizing, searching for, and comparing models and experiments; (3) deploying models locally and in the cloud; and (4) sharing and publishing trained models. Moreover, our system provides a stepping-stone toward enhanced reproducibility. ModelKB currently supports TensorFlow 2.0, Keras, and PyTorch, and it can easily be extended to other deep learning frameworks.

We would like to thank Sirisha Rella and Duy Ho for their help with parts of the implementation in early versions of ModelKB. We would also like to thank the Ph.D. students and the industry participants who helped conduct the user study and evaluate the software system. We also thank the anonymous reviewers for their time and effort in reviewing this work. The first author thanks Yasmin Hussein for her help and support throughout this work. The coauthor, Yugyung Lee, would like to acknowledge the partial support of NSF Grant No. 1747751.
