Abstract
The machine learning (ML) lifecycle is a cyclic process for building an efficient ML system. Although many commercial and community (non-commercial) frameworks have been proposed to streamline the major stages of the ML lifecycle, they are typically too heavyweight, yet still insufficient, for an ML system in its nascent phase. Driven by real-world experience in building and maintaining ML systems, we find that it is more efficient to first initialize the major stages of the ML lifecycle for trial and error, and then extend specific stages to adapt to more complex scenarios. To this end, we introduce MLife, a simple yet flexible framework for fast ML lifecycle initialization. MLife is built around a closed-loop data flow driven by bad cases, especially those that hurt ML model performance the most yet provide the most value for further model development, a key factor in enabling enterprises to fast-track their ML capabilities. MLife is also flexible enough to be extended to more complex scenarios during later maintenance. We present two real-world use cases to demonstrate that MLife is particularly suitable for ML systems in their early phases.

Acknowledgements
This work was supported by funding from Clobotics and Horizon Robotics under the Research Program of Smart Retail and Driver Monitoring System, respectively, and in part by CREST R&D Grant T03C1-17, Malaysia.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Editors: João Gama, Alípio Jorge, Salvador García.
Cite this article
Yang, C., Wang, W., Zhang, Y. et al. MLife: a lite framework for machine learning lifecycle initialization. Mach Learn 110, 2993–3013 (2021). https://doi.org/10.1007/s10994-021-06052-0
DOI: https://doi.org/10.1007/s10994-021-06052-0