Abstract
In recent years, Web services have become increasingly intelligent (e.g., in understanding user preferences) thanks to the integration of components that rely on Machine Learning (ML). Before users can interact with an ML-based service (ML-Service) in the inference phase, the underlying ML model must learn from existing data in the training phase, a process that requires long-lasting batch computations. Managing these two diverse phases is complex, and meeting time and quality requirements with manual approaches is hardly feasible.
This paper highlights some of the major issues in managing ML-Services in both training and inference modes, and presents initial solutions that meet user-set requirements with minimal user input. A preliminary evaluation demonstrates that our solutions make these systems more efficient and predictable with respect to their response time and accuracy.
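To make the two phases concrete, the following minimal sketch contrasts a batch training phase with a latency-sensitive inference phase. It is an illustration only, not the system presented in the paper: the PyTorch framework, the toy two-class model, and the random data are all assumptions made for the example.

import torch
import torch.nn as nn

# Hypothetical toy classifier; it stands in for whatever model an
# ML-Service actually deploys.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# Synthetic training data (256 samples, 4 features, 2 classes).
X, y = torch.randn(256, 4), torch.randint(0, 2, (256,))

# Training phase: a long-lasting batch computation over existing data.
model.train()
for epoch in range(20):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()

# Inference phase: the deployed model answers individual,
# latency-sensitive user requests one at a time.
model.eval()
with torch.no_grad():
    request = torch.randn(1, 4)               # one incoming request
    prediction = model(request).argmax(dim=1)  # predicted class

The contrast is what makes management hard: training is a throughput-oriented batch job that runs for hours and can be scheduled flexibly, whereas inference must answer each request within tight response-time bounds, so the two modes call for distinct resource-management strategies.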
Acknowledgments
This work has been partially supported by the SISMA (MIUR, PRIN 2017, Contract 201752ENYB) and EMELIOT (MUR, PRIN 2020, Contract 2020W3A5FY) national research projects.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Baresi, L., Quattrocchi, G. (2022). Training and Serving Machine Learning Models at Scale. In: Troya, J., Medjahed, B., Piattini, M., Yao, L., Fernández, P., Ruiz-Cortés, A. (eds) Service-Oriented Computing. ICSOC 2022. Lecture Notes in Computer Science, vol 13740. Springer, Cham. https://doi.org/10.1007/978-3-031-20984-0_48
Print ISBN: 978-3-031-20983-3
Online ISBN: 978-3-031-20984-0