Online Workload Forecasting

Self-Aware Computing Systems

Abstract

This chapter summarizes state-of-the-art approaches from different research fields that can be applied to continuously forecast future developments of time series data streams. More specifically, the input time series contain continuously monitored metrics that quantify the number of workload units arriving at a self-aware system. The goal of this chapter is to identify and present the approaches to online workload forecasting that a self-aware system needs in order to act proactively, in terms of problem prevention and optimization, based on likely changes in its usage. The research fields covered are machine learning and time series analysis. We describe explicit limitations and advantages of each forecasting method.

Author information

Corresponding author

Correspondence to Nikolas Herbst.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Herbst, N. et al. (2017). Online Workload Forecasting. In: Kounev, S., Kephart, J., Milenkoski, A., Zhu, X. (eds) Self-Aware Computing Systems. Springer, Cham. https://doi.org/10.1007/978-3-319-47474-8_18

  • DOI: https://doi.org/10.1007/978-3-319-47474-8_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-47472-4

  • Online ISBN: 978-3-319-47474-8

  • eBook Packages: Computer Science, Computer Science (R0)
