Abstract
Workload prediction is an essential prerequisite for allocating resources efficiently and maintaining service level agreements in cloud computing environments. However, because the characteristics of different systems vary, the best solution for a prediction task may not be a single model. In this work, we therefore propose an ensemble model, ESNemble, based on the echo state network (ESN) for workload time series forecasting. ESNemble consists of four main steps: feature selection using ESN reservoirs, dimensionality reduction using kernel principal component analysis, feature aggregation using matrix concatenation, and regression using the least absolute shrinkage and selection operator (LASSO) to produce the final predictions. In addition, the necessary hyperparameters of ESNemble are optimized using a genetic algorithm. For experimental evaluation, we use ESNemble to combine five different prediction algorithms on three recent logs extracted from real-world web servers. The experimental results show that ESNemble outperforms all component models in terms of accuracy and resource allocation, and we report its running time to demonstrate its feasibility in real-world applications.
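The four steps named above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes that each component model's forecast series drives its own reservoir, it uses an RBF kernel, it fits everything in-sample on a synthetic series, and it omits the genetic-algorithm hyperparameter search. All function names, parameter values, and the two "base models" are hypothetical.

```python
import numpy as np

def reservoir_states(u, n_res=30, spectral_radius=0.9, seed=0):
    """Drive a fixed random ESN reservoir with a 1-D series; return states (T, n_res)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=n_res)
    w = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    x = np.zeros(n_res)
    states = np.empty((len(u), n_res))
    for t, u_t in enumerate(u):
        x = np.tanh(w_in * u_t + w @ x)
        states[t] = x
    return states

def kernel_pca(x, n_components=3, gamma=1.0):
    """RBF kernel PCA on the rows of x; returns projections (T, n_components)."""
    sq = np.sum(x ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * x @ x.T, 0.0)
    k = np.exp(-gamma * d2)
    n = len(x)
    one = np.full((n, n), 1.0 / n)
    kc = k - one @ k - k @ one + one @ k @ one  # center in feature space
    vals, vecs = np.linalg.eigh(kc)
    idx = np.argsort(vals)[::-1][:n_components]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

def lasso_fit(a, y, alpha=0.01, n_iter=200):
    """Coordinate descent for (1/2n)||y - a w - b||^2 + alpha * ||w||_1."""
    n, p = a.shape
    a_mean, y_mean = a.mean(axis=0), y.mean()
    ac, yc = a - a_mean, y - y_mean
    col_sq = (ac ** 2).sum(axis=0) / n
    w = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            partial = yc - ac @ w + ac[:, j] * w[j]  # residual without feature j
            rho = ac[:, j] @ partial / n
            w[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
    return w, y_mean - a_mean @ w

# Synthetic workload and two hypothetical component forecasts (assumptions).
rng = np.random.default_rng(42)
t = np.arange(200)
y = 10 + 3 * np.sin(2 * np.pi * t / 24)
base_forecasts = [y + rng.normal(0, 1.0, 200), y + rng.normal(0, 2.0, 200)]

blocks = []
for i, f in enumerate(base_forecasts):
    z = (f - f.mean()) / f.std()                      # standardize reservoir input
    states = reservoir_states(z, seed=i)              # step 1: ESN reservoir
    blocks.append(kernel_pca(states, n_components=3)) # step 2: kernel PCA
features = np.hstack(blocks)                          # step 3: concatenation
w, b = lasso_fit(features, y, alpha=0.01)             # step 4: LASSO readout
pred = features @ w + b
print("in-sample MSE:", np.mean((pred - y) ** 2))
```

Because kernel PCA is fitted on all time steps here, the sketch is purely in-sample; a real forecasting setup would fit the reduction and the readout on a training window and project unseen points separately.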
Acknowledgements
This research was supported by the International Research and Development Program of the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT and Future Planning of Korea (2016K1A3A7A03952054), and by the Smart City R&D project of the Korea Agency for Infrastructure Technology Advancement (KAIA), funded by the Ministry of Land, Infrastructure and Transport (MOLIT) and the Ministry of Science and ICT (MSIT) (Grant 18NSPS-B149386-01).
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Nguyen, H.M., Kalra, G., Jun, T.J. et al. ESNemble: an Echo State Network-based ensemble for workload prediction and resource allocation of Web applications in the cloud. J Supercomput 75, 6303–6323 (2019). https://doi.org/10.1007/s11227-019-02851-4
DOI: https://doi.org/10.1007/s11227-019-02851-4