Cloud datacenter workload estimation using error preventive time series forecasting models

Cluster Computing

Abstract

Workload estimation plays a vital role in the efficient management of cloud resources. This paper introduces the error preventive score (EPS) into time series forecasting models to improve prediction accuracy. The EPS analyzes the most recent estimations to capture the trend in forecast error and uses that trend to produce better forecasts. We also propose two metrics for accuracy evaluation: predictions in error range (PER) and magnitude of predictions (MoP). These metrics evaluate the error and magnitude of each individual forecast, and so favor a model whose predictions are mostly close to the actual values. The impact of EPS on accuracy is evaluated using three workload estimation models. The experimental analysis is carried out over five data traces, and performance is measured using the correlation coefficient (CoC), sum of elasticity index (SEI), mean squared prediction error (MPE), PER, and MoP. The error preventive models achieved maximum improvements of up to 183.9%, 95.4%, and 100.0% over the non-error-preventive models in CoC, SEI, and MPE, respectively. The error preventive models significantly brought individual forecast errors down below 25%, and underestimations were reduced by up to 55.2%. The superiority of the proposed scheme is validated by a comprehensive statistical evaluation based on the Wilcoxon signed-rank test and the Friedman test with Finner post-hoc analysis. We observed that the error preventive weighted exponential smoothing model produced the best forecasts.
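The abstract does not give the EPS formula, so the following is only a minimal sketch of the general idea it describes: use the errors of the most recent forecasts to adjust the next one. Everything here is an assumption for illustration, not the authors' method: the base model (simple exponential smoothing), the correction rule (adding the mean of the last `window` errors), the function names, and the 25% default tolerance in the accuracy helper, which only loosely mirrors the idea of counting predictions that fall within an error range.

```python
from statistics import mean


def smooth_forecasts(series, alpha=0.5):
    """One-step-ahead simple exponential smoothing.

    forecasts[t] is the prediction for series[t] built from series[:t].
    The first forecast is seeded with the first observation.
    """
    forecasts = [series[0]]
    for actual in series[:-1]:
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts


def error_corrected_forecasts(series, base, window=3):
    """Shift each base forecast by the mean of its most recent errors.

    Assumed correction rule (not the paper's EPS): only errors observed
    before time t are used, so the correction never looks ahead.
    """
    corrected = []
    for t, forecast in enumerate(base):
        recent = [series[i] - base[i] for i in range(max(0, t - window), t)]
        corrected.append(forecast + (mean(recent) if recent else 0.0))
    return corrected


def share_within_error(series, forecasts, tolerance=0.25):
    """Fraction of forecasts whose relative absolute error is <= tolerance."""
    hits = sum(
        abs(actual - forecast) <= tolerance * abs(actual)
        for actual, forecast in zip(series, forecasts)
    )
    return hits / len(series)


def mean_abs_error(series, forecasts):
    """Average absolute gap between actual values and forecasts."""
    return mean(abs(a - f) for a, f in zip(series, forecasts))
```

On a steadily rising trace, plain smoothing lags behind and underestimates at every step; the correction term picks up that persistent positive error and pushes later forecasts upward. Comparing `mean_abs_error` and `share_within_error` for the base and corrected forecasts on the same trace shows the effect.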

Acknowledgements

The authors would like to thank the Ministry of Electronics & Information Technology (MeitY), Government of India for financial support to carry out this work.

Author information

Corresponding author

Correspondence to Jitendra Kumar.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kumar, J., Singh, A.K. Cloud datacenter workload estimation using error preventive time series forecasting models. Cluster Comput 23, 1363–1379 (2020). https://doi.org/10.1007/s10586-019-03003-2

  • DOI: https://doi.org/10.1007/s10586-019-03003-2
