Resource needs prediction in virtualized systems: Generic proactive and self-adaptive solution

https://doi.org/10.1016/j.jnca.2019.102443

Highlights

  • We propose a novel algorithm for generic, dynamic and multi-step ahead prediction of resource needs in virtualized systems.

  • We provide dynamic and adaptive prediction adjustment and a padding strategy to reduce the resource under/over-estimation.

  • We determine the optimal sizes of the sliding window and of the predicted data that minimize prediction errors through a Genetic Algorithm.

  • The evaluation results show that on average, our algorithm reduces the under-estimation by 86% over non-adjusted prediction.

  • The results show that on average, our algorithm reduces the over-estimation by 67% over threshold-based provisioning.

Abstract

Resource management of virtualized systems in cloud data centers is a critical and challenging task due to the fluctuating workloads and complex applications in such environments. Over-provisioning is a common practice to meet service level agreement requirements, but this leads to under-utilization of resources and energy waste. Thus, provisioning virtualized systems with resources according to their workload demands is essential. Existing approaches fail to provide a complete solution in this regard: some lack proactivity and dynamism in estimating resources, while others are environment- or application-specific, which limits their accuracy in the case of bursty workloads. Effective resource management requires dynamic and accurate prediction. This work presents a novel prediction algorithm, which (1) is generic, and can thus be applied to any virtualized system, (2) is able to provide proactive estimation of resource requirements through machine learning techniques, and (3) is capable of real-time adaptation with padding and prediction adjustments based on prediction error probabilities in order to reduce under- and over-provisioning of resources. In several virtualized systems, and under different workload profiles, the experimental results show that our proposition is able to reduce under-estimation by an average of 86% over non-adjusted prediction, and to decrease over-estimation by an average of 67% versus threshold-based provisioning.

Introduction

Virtualization is one of the key technologies leveraged to provide scalability, better management flexibility, optimized resource sharing and lower costs in data centers. To capitalize on this technology, it is essential to provision virtualized systems with resources dynamically according to their workload demands. However, the complexity of virtualized systems and applications, their fluctuating resource demands over time (Shyam and Manvi, 2016), and their dynamic and heterogeneous environments all impose a real challenge in resource management, which requires optimizing resource utilization while avoiding Service Level Agreement (SLA) violations. A common practice is to over-provision resources to meet various SLA requirements established with clients. However, this increases the costs incurred in data centers in terms of energy consumption (Goudarzi and Pedram, 2016) and capital expenditure, since more resources have to be available. Scalable and elastic allocation of resources is necessary and crucial for the dynamic adjustment of resource capacity to actual demand in real time, while minimizing SLA violations and delays in resource scaling.

Effective and accurate prediction of resource demands is fundamental to real-time resource needs planning and virtualized resource management in data centers. It helps to meet service level agreement stipulations (by minimizing under-provisioning), anticipate needs in terms of middleboxes (e.g., load balancers, firewalls) and lay the groundwork for proactive job scheduling. Consequently, it improves resource usage and service performance, and reduces costs (by minimizing over-provisioning) (Shyam and Manvi, 2016). Several studies have proposed diverse techniques to address these issues (Shyam and Manvi, 2016; K Hoong et al., 2012; Iqbal and K John, 2012; Lloyd et al., 2013; Liang et al., 2011; Hu et al., 2013; Pezzè and Toffetti, 2016); however, none of them provides a complete solution. Some of these approaches do not offer proactive and adaptive management of resources or even consider SLA requirements. Moreover, some of these solutions are environment-specific or application-specific. This limits their accuracy in the case of unexpected and large amounts of data (Shyam and Manvi, 2016), which is a major drawback in the cloud context, where workloads are highly dynamic and bursty.

To address these limitations, we propose in this work a novel solution, which is generic enough to be applied to any virtualized system or application. It is able to dynamically generate and adjust predictions in real time and offers proactivity in terms of estimating resource demand by anticipating future changes in the system. Our approach provides an algorithm for the dynamic, accurate and effective prediction of resource needs by developing and leveraging different methods and techniques. Black-box prediction methods derive models from the system behavior without requiring any knowledge of the system internals (Gambi, 2013). The adaptability and the efficiency of these methods make them appropriate for virtualized, dynamic and complex environments such as data centers. The proposed algorithm also employs machine learning methods and time series to remain a few steps ahead in the dynamic estimation of resource needs. Furthermore, because prediction is not always sufficiently accurate, adjustments are sometimes needed. Therefore, a dynamic adjustment technique is devised and employed in the prediction algorithm to reduce the under- and over-estimation of resources. A thorough experimental study was also conducted to evaluate the efficiency and performance of the proposed algorithm on different systems and workloads.
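To make the overall flow concrete, the minimal Python sketch below illustrates a predict-observe-adjust loop over a sliding window of monitored data. All names, window sizes and the naive last-value predictor are illustrative assumptions, not the authors' implementation; the actual algorithm uses a Kriging model, Algorithm 2 and GA-tuned window sizes, presented in the following sections.

    # Minimal sketch of a predict-observe-adjust loop over a sliding window.
    # Names, sizes and the last-value predictor are illustrative stand-ins,
    # not the paper's actual Kriging-based implementation.
    from collections import deque

    def predict_next(window, steps):
        """Naive stand-in predictor: repeat the most recent observation."""
        return [window[-1]] * steps

    def run_loop(observations, window_size=5, steps=2):
        window = deque(maxlen=window_size)
        adjustment = 0.0                      # running correction added to predictions
        for t, obs in enumerate(observations):
            if len(window) == window_size:
                predicted = [p + adjustment for p in predict_next(list(window), steps)]
                error = obs - predicted[0]    # > 0 means demand was under-estimated
                if error > 0:
                    adjustment += error       # react quickly to a shortage
                else:
                    adjustment = max(0.0, adjustment + 0.5 * error)
                print(f"t={t} observed={obs:.1f} predicted={predicted[0]:.1f} "
                      f"adjustment={adjustment:.1f}")
            window.append(obs)

    if __name__ == "__main__":
        cpu_percent = [20, 22, 25, 30, 28, 35, 50, 48, 52, 60, 58, 62, 80, 78, 75]
        run_loop(cpu_percent)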

The European Telecommunications Standards Institute (ETSI) introduced the Network Function Virtualization (NFV) concept to reduce costs and accelerate service deployment. The high-level NFV framework proposed by ETSI identifies three main working domains (see Fig. B1 in Appendix B): (1) Virtualized Network Functions (VNFs), (2) the NFV Infrastructure (NFVI) and (3) NFV Management and Orchestration (MANO). The latter covers the orchestration and lifecycle management of physical and/or software resources as well as the lifecycle management of VNFs. The proposed resource prediction algorithm can be integrated into MANO, more specifically as part of the Virtualized Infrastructure Manager (VIM).

The main contributions of this work are threefold:

  • A novel algorithm for dynamic and multi-step ahead prediction of resource needs in virtualized systems, without any prior knowledge of or assumptions about their internal behavior.

  • Dynamic and adaptive adjustment of prediction based on the estimated probability of prediction errors, and padding strategy to reduce under-estimation (SLA violations) and over-estimation (resource loss) of resource needs.

  • Dynamic determination of the optimal sizes of the sliding window and of the predicted data that minimize under- and over-estimation, using a Genetic Algorithm (GA).

The rest of the paper is organized as follows. First, we review the state of the art related to resource demand prediction in the context of virtualized system resource management. Second, we present our proposed approach and we explain the algorithm, the methods and the strategies we used. Third, we evaluate the performance of the prediction algorithm. Finally, we analyze and discuss the main results and conclude the article.

Section snippets

Related work

Resource management of virtualized systems has recently been gaining momentum as a research area, and several techniques have been proposed in this regard. In this section, we review existing work on the techniques used in this domain. To highlight our contributions, in Table 1 we classify the most recent approaches and compare them based on key features needed for efficient prediction of resources in virtualized systems.

Approach overview

In the present work, we propose a generic, dynamic, and self-adaptive prediction of resource needs in virtualized systems. The proposition aims to minimize under-estimation, which can lead to possible SLA violations, and to reduce over-estimation, which causes a loss of resources, without any prior knowledge of the system or any assumption regarding its behavior or load profile. To that end, we propose a novel prediction algorithm that involves three main techniques. The first mechanism

Prediction

Kriging (Krige, 1951; Matheron, 1963) is a spatial interpolation procedure that uses statistical methods for prediction. It assumes a spatial correlation between observed data. In other words, observed data close to each other in the input space are assumed to have similar output values (Pezzè and Toffetti, 2016). Kriging is able to model a system based on its external behavior (black-box model) and generic data. It also provides adaptability to linear, non-linear and multi-modal behaviors of
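Kriging is closely related to Gaussian process regression. As an illustration only, the sketch below uses scikit-learn's GaussianProcessRegressor as a stand-in for a Kriging model fitted on a sliding window of CPU measurements; the window values, kernel choice and prediction horizon are assumptions made for this sketch, not values taken from the paper.

    # Illustrative Kriging-style forecast via Gaussian process regression.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, ConstantKernel

    def kriging_forecast(history, horizon=3):
        """Fit a GP on the recent window and extrapolate `horizon` steps ahead."""
        t = np.arange(len(history)).reshape(-1, 1)        # time index as input
        y = np.asarray(history, dtype=float)
        kernel = ConstantKernel(1.0) * RBF(length_scale=5.0)   # assumed kernel
        gp = GaussianProcessRegressor(kernel=kernel, alpha=1e-2, normalize_y=True)
        gp.fit(t, y)
        future = np.arange(len(history), len(history) + horizon).reshape(-1, 1)
        mean, std = gp.predict(future, return_std=True)
        return mean, std                                   # std quantifies confidence

    if __name__ == "__main__":
        cpu_window = [30, 32, 35, 40, 43, 47, 52, 58, 63, 70]   # synthetic CPU %
        mean, std = kriging_forecast(cpu_window)
        for k, (m, s) in enumerate(zip(mean, std), start=1):
            print(f"step +{k}: predicted {m:.1f}% (±{s:.1f})")

The predictive standard deviation is what makes a Kriging-style model attractive here: it indicates how much trust to place in each multi-step forecast.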

Adjustment

To improve the efficiency of the prediction method and reduce the under-estimation caused by significant changes in resource demand, we propose a dynamic prediction adjustment strategy based on the estimated probability of prediction errors (Algorithm 2) and a variable padding technique. Algorithm 2 describes the proposed strategy. We determine the error adjustment coefficient i that reflects the current tendency for under/over-estimation and we add it to the predicted data. In the event of a
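Algorithm 2 itself is not reproduced in this snippet. The sketch below is only one plausible reading of an error-probability-based adjustment, in which the empirical probability of recent under-estimation is combined with its average magnitude to lift the next predictions; all names and the weighting are assumptions.

    # Hypothetical sketch of a probability-weighted adjustment coefficient;
    # not the paper's Algorithm 2, whose exact form is not shown here.
    def adjustment_coefficient(observed, predicted):
        """observed/predicted: equally long lists of recent paired values."""
        errors = [o - p for o, p in zip(observed, predicted)]   # > 0: under-estimated
        under = [e for e in errors if e > 0]
        p_under = len(under) / len(errors) if errors else 0.0   # empirical probability
        mean_under = sum(under) / len(under) if under else 0.0
        return p_under * mean_under                             # expected shortfall

    def adjust(next_predictions, recent_observed, recent_predicted):
        coeff = adjustment_coefficient(recent_observed, recent_predicted)
        return [p + coeff for p in next_predictions]

    if __name__ == "__main__":
        recent_observed  = [40, 45, 52, 60, 58]
        recent_predicted = [38, 44, 47, 55, 59]
        print(adjust([62, 65, 70], recent_observed, recent_predicted))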

Padding strategies

If the under-estimation exceeds a tolerance threshold (e.g., 10%), an additional adjustment, called padding, is computed (Algorithm 3) and added to the adjusted predicted data in the next prediction step in order to quickly address the gap between the observed data and the predicted data. This ultimately prevents a long-lasting under-estimation and SLA violations. In this context, we tested two padding strategies, namely, ratio-based and standard deviation-based strategies in order to measure
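Since Algorithm 3 is not reproduced in this snippet, the following is a hedged sketch of how the two strategies could look: padding kicks in only when the relative under-estimation exceeds the tolerance threshold, and its size is either the observed shortfall (ratio-based) or one standard deviation of recent demand (standard deviation-based). The threshold value and all names are assumptions.

    # Hypothetical sketches of the two padding strategies named above.
    import statistics

    def ratio_padding(observed, predicted, tolerance=0.10):
        """Pad by the observed shortfall when the relative gap exceeds the tolerance."""
        gap = (observed - predicted) / observed if observed else 0.0
        return observed - predicted if gap > tolerance else 0.0

    def stddev_padding(recent_observations, observed, predicted, tolerance=0.10):
        """Pad by one standard deviation of recent demand instead of the raw gap."""
        gap = (observed - predicted) / observed if observed else 0.0
        if gap <= tolerance or len(recent_observations) < 2:
            return 0.0
        return statistics.stdev(recent_observations)

    if __name__ == "__main__":
        history = [40, 42, 47, 55, 61]
        observed, predicted = 61, 52            # roughly 15% under-estimation
        print(ratio_padding(observed, predicted))           # pads by the 9-unit gap
        print(stddev_padding(history, observed, predicted)) # pads by ~8.9 units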

Optimization

A time series is an ordered collection of values obtained through repeated measurements, typically over equally-spaced time intervals (Wei, 1990), and its analysis allows the extraction of a model that describes particular patterns in the collected data. Therefore, for dynamic resource consumption prediction, we consider real-time data collected from the system as time series, where at each sliding window, the prediction of the next resource demand is performed. Each set of i observed data is
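As a rough illustration of the GA-based search (the fitness function, the naive moving-average predictor and all parameter ranges below are assumptions, not the paper's settings), a compact genetic algorithm can explore window sizes and prediction horizons and keep the pair with the lowest combined under/over-estimation cost:

    # Toy genetic search over (window size, prediction horizon); the fitness
    # uses a simple moving-average predictor on a synthetic demand trace and
    # weighs under-estimation more heavily than over-estimation.
    import random

    TRACE = [30 + 10 * (i % 7) + random.gauss(0, 2) for i in range(200)]

    def estimation_cost(window, horizon, trace=TRACE):
        cost = 0.0
        for t in range(window, len(trace) - horizon):
            pred = sum(trace[t - window:t]) / window        # moving-average stand-in
            for k in range(horizon):
                err = trace[t + k] - pred
                cost += 2.0 * err if err > 0 else -err      # under-estimation costs more
        return cost

    def genetic_search(generations=30, pop_size=20):
        pop = [(random.randint(3, 30), random.randint(1, 10)) for _ in range(pop_size)]
        for _ in range(generations):
            pop.sort(key=lambda g: estimation_cost(*g))
            parents = pop[: pop_size // 2]                  # keep the fittest half
            children = []
            while len(children) < pop_size - len(parents):
                a, b = random.sample(parents, 2)            # crossover of two parents
                child = (a[0], b[1])
                if random.random() < 0.3:                   # occasional mutation
                    child = (max(3, child[0] + random.randint(-2, 2)),
                             max(1, child[1] + random.randint(-1, 1)))
                children.append(child)
            pop = parents + children
        return min(pop, key=lambda g: estimation_cost(*g))

    if __name__ == "__main__":
        window, horizon = genetic_search()
        print(f"selected window={window}, horizon={horizon}")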

Experimental evaluation

We evaluated the cost of the resource demand prediction in terms of SLA violations and resource wastage by computing the probability of under-estimations (PrUnderEstim), the mean of over-estimations (EOverEstim) and the mean time between under-estimations (MTBUE) for both the predicted and the adjusted data. We also considered the mean of over-estimations in the case of static provisioning of resources (threshold-based provisioning), that is, an over-provisioning of
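For clarity, the sketch below computes the three metrics named above from paired observed/predicted traces; the 30 s sampling interval and the sample values are hypothetical, not taken from the experiments.

    # Illustrative computation of Pr(under-estimation), mean over-estimation
    # and mean time between under-estimations (MTBUE).
    def evaluation_metrics(observed, predicted, interval_s=30):
        errors = [o - p for o, p in zip(observed, predicted)]
        under_idx = [i for i, e in enumerate(errors) if e > 0]   # under-estimated steps
        over = [-e for e in errors if e < 0]                     # over-estimated amounts
        pr_under = len(under_idx) / len(errors)
        mean_over = sum(over) / len(over) if over else 0.0
        gaps = [b - a for a, b in zip(under_idx, under_idx[1:])]
        mtbue = (sum(gaps) / len(gaps)) * interval_s if gaps else float("inf")
        return pr_under, mean_over, mtbue

    if __name__ == "__main__":
        observed  = [40, 45, 52, 60, 58, 63, 70, 66]
        predicted = [42, 44, 55, 58, 60, 65, 68, 70]
        pr, over, mtbue = evaluation_metrics(observed, predicted)
        print(f"Pr(under)={pr:.2f}  mean over-estimation={over:.1f}  MTBUE={mtbue:.0f}s")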

Conclusion and future work

This work presents a generic, dynamic and multi-step ahead prediction of resource demand in virtualized systems. Based on time series analysis and a machine learning method, our proposed algorithm is able to provide real-time prediction of resource needs without any prior knowledge of or assumptions about the system or its internal behavior. When unexpected workload fluctuations occur, the proposed algorithm is capable of adapting to these changes within a relatively short delay (e.g. on average, 40 s in

Acknowledgment

This work has been supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC), in part by Ericsson Canada and in part by Rogers Communications Canada.

References (47)

  • C. Cortes et al. Support-vector networks. Mach. Learn. (1995)

  • P.F. Dunn. Measurement and Data Analysis for Engineering and Science (2014)

  • ETSI GS NFV 002 V1.1.1. "Network functions virtualisation (NFV); Architectural framework". Online...

  • F.O.K.U.S. Fraunhofer. Open source IMS core by cnd

  • A. Gambi. Kriging-based Self-Adaptive Controllers for the Cloud (2013)

  • A. Gandhi et al. Minimizing data center SLA violations and power consumption via hybrid resource provisioning

  • R. Gayraud et al. SIPp. Online: http://sipp.sourceforge.net/, accessed:...

  • H. Goudarzi et al. Hierarchical SLA-driven resource management for peak power-aware and energy-efficient operation of a cloud datacenter. IEEE Transactions on Cloud Computing (2016)

  • Y. Gratton. Le krigeage: la méthode optimale d'interpolation spatiale. Les articles de l'Institut d'Analyse Géographique. Online (2002)

  • R. Hu et al. CPU load prediction using support vector regression and Kalman smoother for cloud

  • M.F. Iqbal et al. Power and performance analysis of network traffic prediction techniques

  • J. Jiang et al. Optimal cloud resource auto-scaling for web applications

  • P. K Hoong et al. BitTorrent network traffic forecasting with ARMA. Int. J. Comput. Netw. Commun. (2012)

    Souhila Benmakrelouf holds a Master's degree in Software Engineering from the École de technologie supérieure (ÉTS), University of Quebec (Canada), and a Bachelor's degree in Computer Engineering from the University of Houari Boumediene, Algiers (Algeria). She is currently a PhD student at ÉTS. Her main research interests include cloud computing, resource management and machine learning techniques.

    Nadjia Kara holds a Ph.D. in Electrical and Computer Engineering from École Polytechnique of Montreal (Canada) and a Master's degree in Electrical and Computer Engineering from École Polytechnique of Algiers (Algeria). She has several years of experience in research and development and worked in industry for more than ten years. From 2005 to 2015, she held adjunct professor positions at INRS-EMT (Canada), the University of Sherbrooke (Canada), and Concordia University (Canada). Since 2009, she has been a full professor in the department of software engineering

    Hanine Tout received the Ph.D. degree in software engineering from École de Technologie Supérieure (ÉTS), University of Quebec, Montreal, Canada, and the MSc degree in computer science from the Lebanese American University, Beirut, Lebanon. She is currently a Postdoctoral Fellow at Ericsson, Montreal, Canada. Her research interests include 5G, IoT, machine learning, mobile cloud computing, mobile virtualization, optimization, Web services, security and formal verification. She is serving as a TPC member for IMCET'16, NTMS 2016 and SSCC-2018, and as a reviewer for IEEE Communications Letters, Computers & Security, IEEE Transactions on Cloud Computing and several international conferences. She is a student member of the IEEE.

    Rafi Rabipour has been engaged in research and development at Bell-Northern Research, Nortel Networks, and Ericsson. His work at Bell-Northern Research and Nortel Networks was mainly focused on the development of digital signal processing algorithms aimed at improving the performance of wireless and VoIP products. At Ericsson he participated in research in the domain of Cloud Computing, on topics such as non-linear performance characteristics of virtualized applications, specific facets of the Internet-of-Things, as well as approaches to resource management. He holds a Master's degree in Electrical Engineering from McGill University in Montreal.

    Claes Edstrom is a Senior Specialist in Cloud Computing based in Montreal. He is responsible for initiating and executing exploratory projects in the areas of NFV and cloud technologies. His research interests include application transformation, resource management and automation in cloud computing environments. Before joining Ericsson Canada in 2007, Edstrom spent more than 15 years working for the company in Sweden and Denmark in various technology roles. He has worked on system studies for WCDMA, tender support for initial 3G/UMTS offerings, roll-out and system upgrades for 2G/3G networks in the Nordic region, and the development of AMPS/D-AMPS networks in North America.
