Abstract
Predicting the execution time of computational jobs helps improve resource management, reduce execution cost, and optimize energy consumption. In this paper, we evaluate machine learning techniques for predicting the execution times of scientific workflow jobs. We assess how several aspects of applying these techniques affect prediction performance: (1) the performance of different regressors; (2) using a single-stage prediction pipeline vs. a two-stage one; (3) the granularity of categorization in the first stage of the two-stage pipeline; and (4) training one global model for all jobs vs. using separate models for individual job types. We also propose a novel prediction model based on symbolic regression and evaluate its performance. The interpretability of prediction models and the use of proper performance metrics are also discussed. The experimental evaluation yields a number of findings that provide insight into how to apply machine learning techniques to predicting the execution time of computational jobs.
Acknowledgment
The research presented in this paper was partially supported by funds of the Polish Ministry of Education and Science assigned to AGH University of Science and Technology.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Balis, B., Grabowski, M. (2023). Evaluation of Machine Learning Techniques for Predicting Run Times of Scientific Workflow Jobs. In: Wyrzykowski, R., Dongarra, J., Deelman, E., Karczewski, K. (eds) Parallel Processing and Applied Mathematics. PPAM 2022. Lecture Notes in Computer Science, vol 13826. Springer, Cham. https://doi.org/10.1007/978-3-031-30442-2_15
DOI: https://doi.org/10.1007/978-3-031-30442-2_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30441-5
Online ISBN: 978-3-031-30442-2
eBook Packages: Computer Science (R0)