HAS: Hybrid auto-scaler for resource scaling in cloud environment

https://doi.org/10.1016/j.jpdc.2018.04.016

Highlights

  • A Hybrid Auto-Scaler (HAS) framework is developed for automated resource scaling in a cloud environment. It combines the predictive and the reactive methods for an effective auto-scaling process.

  • HAS employs auto-regression of order one to estimate the future arrival rate. A novel set of equations is proposed to compute the future resource requirement.

  • The reactive method is used only when the computed resources are insufficient to handle the workloads.

  • A Continuous Time Markov Model is employed to allocate the resources and to balance the load.

  • The HAS framework is validated in a real cloud environment to demonstrate its efficiency in terms of resource utilization, response time and scalability.

Abstract

Auto-scaling is a crucial mechanism that supports autonomic provisioning and de-provisioning of computing resources in accordance with fluctuating demands in a cloud environment. The success of autonomic provisioning depends on two performance metrics: efficient resource utilization and response time. Existing literature focuses on either reactive or predictive auto-scaling mechanisms; when these mechanisms are employed in isolation, the computing system is unable to scale proportionally with the Slashdot effect or abrupt traffic bursts. Predictive methods strive to predict the future computational needs and subsequently obtain or release the resources in advance; however, this can lead to under-utilization. Hence, a Hybrid Auto-Scaler (HAS) is proposed to automatically adjust the resources required by the application in demand. HAS forecasts the future behaviour of the system using a time series method and deploys the anticipated resources by computing the required capacity through a queuing model. Further, it uses a reactive approach to scale out the resources when the provisioned resources are insufficient to deal with the current needs. HAS also balances the load efficiently by employing a Continuous Time Markov Model (CTMM). The proposed HAS is validated with several benchmark workloads and achieves significant improvement in CPU utilization and response time.

Introduction

Cloud computing is a model for offering a seemingly infinite number of computing resources in the form of Virtual Machines (VMs). The underlying infrastructure of a cloud computing system consists of data centres and clusters of servers that are monitored and maintained by the cloud service providers. Here, the infrastructure providers should ensure an efficient and flexible delivery of services for different customer requirements. A VM isolates applications from the underlying hardware and can be customized to meet the requirements of the end user. Moreover, virtualization technology has been increasingly applied for provisioning web applications and for allocating new resources rapidly [26]. Elasticity is an important characteristic of the cloud environment that allows users to obtain and release resources dynamically according to fluctuating requirements. However, it is well known that deciding the accurate quantity of resources is a complex process [11]. If the situation is predictable, capacity planning mechanisms can be utilized. For an unpredictable, fluctuating load, however, an auto-scaling system is necessary to relieve the user of the burden of deciding the required computing resources during workload execution [36], [22].

Auto-scaling is one of the significant and latest innovations for automated resource configuration in the cloud environment [10]. Building auto-scaling systems requires fine-tuning of resources in accordance with load variations without any human involvement. Resource provisioning is a complex process, as it requires the application provisioner to calculate the anticipated hardware configuration that guarantees the QoS targets of application services [13]. Further, high performance computing applications produce huge dynamic workloads characterized by variable arrival patterns, uncertain I/O behaviour and differing service time distributions. Determining the precise computational requirements of such dynamic workloads is not practical with the existing methods. Estimation error is a serious challenge that causes under- or over-estimation of system requirements because of insufficient knowledge of the complexities that exist in the computational resources and application performance targets.

Cloud elasticity implies mapping performance requirements to the underlying computing resources available. Adapting resources to an application's on-demand requirements is complicated because under-provisioning hurts system performance and causes Service Level Objective (SLO) violations, while over-provisioning results in unused instances, which is not cost effective. In addition, admission control is another issue in cloud provisioning: the server must be kept from overloading by providing sufficient resources. Therefore, a more refined technique is required for automated resource scaling in accordance with the demand.

Solutions for auto-scaling of computing resources have been presented in the form of predictive as well as reactive methods, and they have been discussed comprehensively in the recent literature [[26], [19], [39], [1]]. Previously, the reactive method was employed for auto-scaling in the cloud environment; it requires accurate quantitative values that are typically prone to uncertainty [[11], [2]]. The state-of-the-art methods used in predictive provisioning mode (time series methods [32], control theory [18], Reinforcement Learning (RL) [47] and queuing theory [16]) also have certain drawbacks, such as difficulty in predicting peak workloads and workload variations. Queuing models [11] are extensively used in the literature for modelling contemporary internet-based applications, but these models suit only a stationary environment. Furthermore, existing Queuing Model (QM) approaches hold impractical assumptions regarding elastic systems, and these assumptions are not suitable for a cloud computational environment, where uncertainty is frequent in the form of noise and dynamic workload fluctuations [20].

In the present work, an extensive study on auto-scaling of resources and resource allocation problems is carried out and a Hybrid Auto-Scaler (HAS) framework (Predictive–Reactive) is presented that incorporates the following contributions:

  • Auto-scaling of resources in a cloud environment is investigated and a HAS approach based on a time series method, QM and Continuous Time Markov Model (CTMM) is introduced.

  • A time series method (auto-regression of order one, AR(1)) is used to predict the future incoming workload arrival, and the resources necessary for that arrival are computed using a novel analytical method.

  • The analytical method estimates the required number of VMs and their capacity requirements. Here, the essential VM capacity for executing the workload is computed and allocated without any resource wastage.

  • The reactive provisioning approach is employed when the provisioned resources are insufficient for the current workload arrival.

  • CTMM is used for load balancing and for allocating resources to the incoming arrivals.

  • An extensive analysis is carried out to evaluate HAS with dynamic benchmarking applications such as RUBiS [46], RUBBoS [45], Cassandra [5] and Olio [40] under varying workloads. The proposed method is compared across different scenarios in realistic environments to demonstrate the efficiency of resource provisioning and allocation in a computing environment.
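The AR(1) prediction step listed above can be illustrated with a minimal Python sketch. It assumes an ordinary least-squares fit of x[t] = c + phi * x[t-1] to a window of past arrival rates; the estimation procedure, the function names (fit_ar1, forecast_next) and the sample data are hypothetical and are not taken from the paper.

```python
from statistics import mean


def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    if len(series) < 3:
        raise ValueError("need at least three observations")
    prev, curr = series[:-1], series[1:]
    mean_prev, mean_curr = mean(prev), mean(curr)
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    if var_prev == 0:
        # Flat history: fall back to forecasting the observed mean.
        return mean_curr, 0.0
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    phi = cov / var_prev
    c = mean_curr - phi * mean_prev
    return c, phi


def forecast_next(series):
    """One-step-ahead arrival-rate forecast, clamped at zero."""
    c, phi = fit_ar1(series)
    return max(0.0, c + phi * series[-1])


if __name__ == "__main__":
    # Hypothetical arrival rates (requests per second) per monitoring interval.
    history = [100.0, 60.0, 40.0, 30.0, 25.0]
    print(forecast_next(history))
```

The forecast would then feed the capacity computation; how the paper maps a predicted arrival rate to VM capacity is not shown in this excerpt.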

Queuing theory

Queuing Theory (QT) is extensively employed to represent internet-based applications and traditional servers, to compute performance metrics such as the queue length and average waiting time for requests. The cloud application scenario uses a simple QM for a load balancer that allocates the requests among ‘n’ VMs. QT is preferred for stationary systems with constant arrival and service rates. An auto-scaling problem in realistic cloud environment is modelled by periodically varying the incoming
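To make the queuing metrics mentioned above concrete, the following Python sketch treats the load balancer and its n VMs as an M/M/n queue and uses the standard Erlang C formula to find the smallest n that meets a mean response time target. The M/M/n choice (Poisson arrivals, exponential service, identical VMs) is an assumption made for illustration; the excerpt does not specify the exact queuing model, and the function names are hypothetical.

```python
import math


def erlang_c(n, offered_load):
    """Probability that an arriving request must wait in an M/M/n queue."""
    if offered_load >= n:
        return 1.0  # unstable system: every request waits
    # Erlang B via the usual stable recurrence, then convert to Erlang C.
    b = 1.0
    for k in range(1, n + 1):
        b = offered_load * b / (k + offered_load * b)
    rho = offered_load / n
    return b / (1.0 - rho * (1.0 - b))


def mean_response_time(n, arrival_rate, service_rate):
    """Mean time in system (queueing delay plus service) for M/M/n."""
    offered_load = arrival_rate / service_rate
    if offered_load >= n:
        return math.inf
    wait = erlang_c(n, offered_load) / (n * service_rate - arrival_rate)
    return wait + 1.0 / service_rate


def min_vms(arrival_rate, service_rate, target_response_time):
    """Smallest number of VMs whose mean response time meets the target."""
    if target_response_time <= 1.0 / service_rate:
        raise ValueError("target must exceed the mean service time")
    n = max(1, math.ceil(arrival_rate / service_rate))
    while mean_response_time(n, arrival_rate, service_rate) > target_response_time:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical load: 8 requests/s, each VM serves 1 request/s.
    print(min_vms(8.0, 1.0, 1.5))
```

Because the formula assumes constant arrival and service rates, a sketch like this would have to be re-evaluated for each interval when the arrival rate varies over time.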

Hybrid auto-scaler (predictive–reactive) framework

The goal of the HAS framework is to allocate resources with sufficient capacity for processing the incoming workloads. It addresses two issues pertaining to resource provisioning: estimating the required capacity for every workload and determining the time at which it has to be provisioned. Once a resource request arrives, the task is executed to completion without any request being dropped or rejected. Further, the system is modelled as an open QM where the process enters the system and completes the execution
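The interplay between the predictive and reactive modes can be sketched as a single control step in Python. The sketch below is a simplification under stated assumptions: capacity is sized with a plain utilization headroom rule rather than the paper's analytical equations (which are not reproduced in this excerpt), and the names vms_for_rate, hybrid_step and target_utilization are hypothetical.

```python
import math


def vms_for_rate(arrival_rate, per_vm_rate, target_utilization=0.8):
    """VMs needed so that each VM stays at or below the target utilization.

    Assumption: a simple headroom rule stands in for the paper's
    capacity equations.
    """
    usable_rate = per_vm_rate * target_utilization
    return max(1, math.ceil(arrival_rate / usable_rate))


def hybrid_step(predicted_rate, observed_rate, per_vm_rate,
                target_utilization=0.8):
    """Return (vm_count, mode) for one monitoring interval.

    The predictive plan is used by default; the reactive path scales out
    only when the observed load needs more VMs than were planned.
    """
    planned = vms_for_rate(predicted_rate, per_vm_rate, target_utilization)
    needed = vms_for_rate(observed_rate, per_vm_rate, target_utilization)
    if needed > planned:
        return needed, "reactive"
    return planned, "predictive"


if __name__ == "__main__":
    # Hypothetical interval: forecast of 95 req/s, burst of 170 req/s observed.
    print(hybrid_step(95.0, 170.0, 25.0))
```

Keeping the reactive path as a fallback means the forecast drives provisioning in the common case, while a burst beyond the forecast still triggers scale-out.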

Experimental evaluation

The present work aims to achieve maximum resource utilization with a reasonably low response time during the resource scaling process. The experimental evaluation is performed in four steps, summarized as follows:

  • The proposed integrated (predictive–reactive) provisioning mechanisms are applied for internet applications with peak workloads and the observations are made.

  • The future arrival rate (predicted trend), required capacity and performance metrics (resource

Conclusion

A Hybrid Auto-Scaler is presented for provisioning virtualized computing resources for modern internet applications without any human intervention. A time series method is employed to predict the future arrivals, and novel analytical models are presented for pre-estimating the capacity demands. Subsequently, CTMM is used for allocating the resources and balancing the load. The system is evaluated using various benchmarking applications with fluctuating loads to verify its efficiency.

Acknowledgment

The support of Anna University Regional Campus, Tirunelveli, in terms of computing facilities is gratefully acknowledged.

Bibal Benifa JV received her B.E. degree in Computer Science and M.E. degree in Software Engineering from Anna University, Chennai in 2009 and 2011, respectively. Currently she is a Ph.D. Research Scholar with the Department of Computer Science and Engineering, Anna University Regional Campus, Tirunelveli. Her research interests include cloud computing, distributed computing and large scale data processing.

References (58)

  • Barett E. et al., Applying reinforcement learning towards automating resource allocation and application scalability in the cloud, Concurr. Comput.: Pract. Exper. (2012)
  • A. Beitch, B. Liu, T. Yung, R. Griffith, A. Fox, D.A. Patterson, Rain: A Workload Generation Toolkit for Cloud...
  • Benifa J.V.B. et al., Reinforcement learning-based proactive auto-scaler for resource provisioning in cloud environment
  • Benifa J.V.B. et al., An auto-scaling framework for heterogeneous hadoop systems, Int. J. Cooper. Inf. Syst. (2017)
  • Botran T.L. et al., A review of auto-scaling techniques for elastic applications in cloud environments, J. Grid Comput. (2014)
  • Byholm B. Ashraf, I. Porres, CRAMP: Cost-efficient resource allocation for multiple web applications with proactive...
  • R.N. Calheiros, R. Ranjan, R. Buyya, Virtual machine provisioning based on analytical performance and QoS in cloud...
  • A. Chandra, W. Gong, P. Shenoy, Dynamic resource allocation for shared data centers using online measurements, in:...
  • A.D.S. Dias, L.H.V. Nakamura, J.C. Estrella, R.H.C. Santana, M.J. Santana, Providing IaaS resources automatically...
  • X. Dutreilh, S. Kirgizov, O. Melekhova, J. Malenfant, N. Rivierre, I. Truck, Using reinforcement learning for autonomic...
  • X. Dutreilh, N. Rivierre, A. Moreau, J. Malenfant, I. Truck, From data center resource allocation to control theory and...
  • A.A. Eldin, J. Tordsson, E. Elmroth, An adaptive hybrid elasticity controller for cloud infrastructures, in: Network...
  • Escoi F.D.M. et al., A survey on elasticity management in PaaS systems, Computing (2017)
  • Gambi A. et al., Testing elastic computing systems, IEEE Internet Comput. (2013)
  • Y. Hirashima, K. Yamasaki, M. Nagura, Proactive-reactive auto-scaling mechanism for unpredictable load change, in:...
  • https://console.aws.amazon.com/cloudwatch/ [Accessed:...
  • https://www.manageengine.com/products/applications_manager/amazon-ec2-monitoring [Accessed:...
  • N. Huber, M.V. Quast, M. Hauck, S. Kounev, Evaluating and modeling virtualization performance overhead for cloud...
  • Hummaida A.R. et al., Adaptation in cloud resource configuration: a survey, J. Cloud Comput. (2016)

Dejey Dharma received her B.E. and M.E. degrees in Computer Science and Engineering from Manonmaniam Sundaranar University, Tirunelveli, India, in 2003 and 2005, respectively. Later, she was with the Department of Computer Science and Engineering, Manonmaniam Sundaranar University, Tirunelveli, India, as a Junior Research Fellow under the UGC Research Grant. She completed her Ph.D. in Computer Science and Engineering in 2011. She has been with the Department of Computer Science and Engineering, Anna University Regional Campus, Tirunelveli, as an Assistant Professor since 2010 and as the Head of the Department from 2011 to 2015. She is a member of IEEE, IE(INDIA) and ISTE. Her research interests include large scale data processing, image and signal processing, watermarking, information hiding and multimedia security.