Abstract
The importance of cloud computing has been growing rapidly due to the increasing number of user requests for diverse sets of resources. Although clouds have rich resources to handle these incoming requests, under- or over-provisioning of resources can lead to failure. Therefore, it is important to provision cloud resources appropriately. Machine-learning-based techniques have proven effective in managing resources while maintaining a Service Level Agreement (SLA). These techniques require complete data to produce good prediction results. In practice, however, the data may be incomplete, and a larger share of missing attribute values can negatively affect the outcome of the predictions. Therefore, interpolation of missing attribute values is crucial for better predictions. However, the existing methods for interpolating missing attribute values are computationally expensive. This paper first predicts resource usage in terms of CPU by applying the LightGBM model to a real dataset. It then uses the explanations of SHapley Additive exPlanations (SHAP) in combination with K-Nearest Neighbor (KNN) to interpolate missing values in the dataset for CPU usage prediction. The experimental results show that SHAP explanations can help cloud providers select important features for the interpolation of missing values. This SHAP-based interpolation results in lower computational time, with acceptable accuracy, in comparison with KNN-based interpolation.
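The abstract does not spell out how the SHAP ranking and KNN are combined, so the following is only a minimal sketch of one plausible reading: rank the features by an importance score, then run KNN interpolation with distances computed over the top-ranked features only. The importance scores are hypothetical placeholders standing in for mean absolute SHAP values, and the function names, column layout, and toy rows are all assumptions made for illustration, not the authors' implementation.

```python
import math


def top_k_features(importance, k):
    """Return the indices of the k features with the largest importance score."""
    ranked = sorted(range(len(importance)), key=lambda j: importance[j], reverse=True)
    return ranked[:k]


def knn_impute(rows, target_col, feature_cols, n_neighbors=2):
    """Fill None entries in target_col with the mean of the nearest complete rows.

    Distance is Euclidean and is computed only over feature_cols, so fewer
    selected features means less work per distance computation.
    """
    complete = [r for r in rows if r[target_col] is not None]
    filled = []
    for r in rows:
        new_row = list(r)
        if r[target_col] is None:
            nearest = sorted(
                complete,
                key=lambda c: math.dist(
                    [r[j] for j in feature_cols],
                    [c[j] for j in feature_cols],
                ),
            )[:n_neighbors]
            new_row[target_col] = sum(c[target_col] for c in nearest) / len(nearest)
        filled.append(new_row)
    return filled


# Toy data (hypothetical): columns 0-2 are always observed, column 3 has a gap.
rows = [
    [0.10, 0.20, 5.0, 10.0],
    [0.12, 0.22, 90.0, 12.0],
    [0.80, 0.90, 6.0, 50.0],
    [0.82, 0.88, 88.0, 54.0],
    [0.11, 0.21, 89.0, None],
]

# Hypothetical importance scores for columns 0-2 (stand-ins for mean |SHAP|).
importance = [0.50, 0.30, 0.01]

selected = top_k_features(importance, k=2)
filled = knn_impute(rows, target_col=3, feature_cols=selected)
filled_all = knn_impute(rows, target_col=3, feature_cols=[0, 1, 2])

print(selected, filled[4][3], filled_all[4][3])
```

In this toy case, restricting the distance to the two highest-ranked columns picks the first two rows as neighbors, while using all three columns lets the low-importance column dominate the distance and changes the neighbors. In practice the scores would come from a fitted model and a SHAP explainer rather than being typed in by hand.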
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Fahimullah, M., Gupta, R., Ahvar, S., Trocan, M. (2022). Explaining Predictive Scheduling in Cloud. In: Szczerbicki, E., Wojtkiewicz, K., Nguyen, S.V., Pietranik, M., Krótkiewicz, M. (eds) Recent Challenges in Intelligent Information and Database Systems. ACIIDS 2022. Communications in Computer and Information Science, vol 1716. Springer, Singapore. https://doi.org/10.1007/978-981-19-8234-7_7
DOI: https://doi.org/10.1007/978-981-19-8234-7_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-8233-0
Online ISBN: 978-981-19-8234-7
eBook Packages: Computer Science, Computer Science (R0)