HAS: Hybrid auto-scaler for resource scaling in a cloud environment
Introduction
Cloud computing is a model for offering a seemingly unlimited pool of computing resources in the form of Virtual Machines (VMs). The underlying infrastructure of a cloud computing system consists of data centres and clusters of servers that are monitored and maintained by the cloud service providers, who should ensure efficient and flexible delivery of services for different customer requirements. A VM isolates applications from the underlying hardware and can be customized to meet the requirements of the end user. Virtualization technology has therefore been increasingly applied to provision web applications and to allocate new resources rapidly [26]. Elasticity is an important characteristic of the cloud environment that allows users to obtain and release resources dynamically according to fluctuating requirements. However, it is well known that deciding the accurate quantity of resources is a complex process [11]. If the demand is predictable, capacity planning mechanisms can be utilized; for unpredictable, fluctuating load, an auto-scaling system is necessary to relieve the user of the burden of deciding the computing resources required during workload execution [36,22].
Auto-scaling is one of the most significant recent innovations for automated resource configuration in the cloud environment [10]. The challenge of building auto-scaling systems lies in fine-tuning resources in accordance with load variations without any human involvement. Resource provisioning is a complex process, as it requires the application provisioner to calculate the anticipated hardware configuration needed to guarantee the QoS targets of application services [13]. Further, high-performance computing applications generate large dynamic workloads characterized by variable arrival patterns, uncertain I/O behaviour and differing service-time distributions. Determining the precise computational requirements of such dynamic workloads is not practical with existing methods. Estimation error is a serious challenge that causes under- or over-estimation of system requirements, because knowledge of the complexities of the computational resources and of application performance targets is limited.
Cloud elasticity implies mapping performance requirements to the underlying computing resources available. Adapting resources to an application's on-demand requirements is complicated: under-provisioning hurts system performance and causes Service Level Objective (SLO) violations, while over-provisioning leaves instances unused, which is not cost-effective. In addition, admission control is another issue in the course of cloud provisioning, needed to avoid overloading the server while providing sufficient resources. Therefore, a more refined technique is required for automated resource scaling in accordance with demand.
Solutions for auto-scaling of computing resources have been presented in the form of both predictive and reactive methods, and they have been discussed comprehensively in the recent literature [26,19,39,1]. Traditionally, reactive methods are employed for auto-scaling in the cloud environment, and they require accurate quantitative values that are typically prone to uncertainty [11,2]. The state-of-the-art methods used in the predictive provisioning mode (time series methods [32], control theory [18], reinforcement learning (RL) [47] and queuing theory [16]) also have certain drawbacks, such as difficulty in predicting peak workloads and workload variations. Queuing models [11] are extensively used in the literature for modelling contemporary internet-based applications, but these models suit only stationary environments. Furthermore, existing approaches based on the Queuing Model (QM) hold impractical assumptions regarding elastic systems, and these assumptions are not suitable for a cloud computational environment, where uncertainty is frequent in the form of noise and dynamic workload fluctuations [20].
In the present work, an extensive study on auto-scaling of resources and resource allocation problems is carried out, and a Hybrid Auto-Scaler (HAS) framework (predictive–reactive) is presented that makes the following contributions:
-
Auto-scaling of resources in a cloud environment is investigated and a HAS approach based on time series method, QM and Continuous Time Markov Model (CTMM) is introduced.
-
A time series method (Auto Regression (1)) is utilized to predict the future incoming workload arrival rate, and the resources necessary for that arrival are computed using a novel analytical method.
-
The analytical method estimates the required number of VMs and their capacity requirements. Here, the VM capacity essential for executing the workload is computed and allocated without any resource wastage.
-
A reactive provisioning approach is employed when the provisioned resources are insufficient for the current workload arrival.
-
CTMM is utilized for load balancing and resource allocation for the incoming arrivals.
-
An extensive analysis is carried out to evaluate HAS with dynamic benchmarking applications such as RUBiS [46], RUBBoS [45], Cassandra [5] and Olio [40] under varying workloads. The proposed method is compared across different scenarios in realistic environments to demonstrate the efficiency of resource provisioning and allocation in a computing environment.
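To make the predictive step concrete, the sketch below shows a one-step-ahead AR(1) forecast of the arrival rate followed by a simple VM-count estimate. This is an illustrative simplification, not the paper's analytical model: the arrival history, the per-VM service rate, and the `vms_required` rule (offered load divided by per-VM capacity) are hypothetical assumptions.

```python
import math

def ar1_forecast(history):
    """One-step-ahead AR(1) prediction: x_{t+1} = mu + phi * (x_t - mu)."""
    mu = sum(history) / len(history)                  # sample mean
    num = sum((history[i] - mu) * (history[i - 1] - mu)
              for i in range(1, len(history)))
    den = sum((x - mu) ** 2 for x in history)
    phi = num / den if den else 0.0                   # lag-1 autocorrelation estimate
    return mu + phi * (history[-1] - mu)

def vms_required(arrival_rate, service_rate_per_vm):
    """Minimum VMs so the offered load fits the provisioned pool (hypothetical rule)."""
    return max(1, math.ceil(arrival_rate / service_rate_per_vm))

arrivals = [100, 120, 110, 130, 125, 140]             # requests/s in past intervals (made up)
predicted = ar1_forecast(arrivals)
print(vms_required(predicted, service_rate_per_vm=30))
```

In practice the AR(1) coefficient would be refitted each control interval as new arrival observations are collected.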
Queuing theory
Queuing Theory (QT) is extensively employed to represent internet-based applications and traditional servers, and to compute performance metrics such as the queue length and the average waiting time for requests. The cloud application scenario uses a simple QM for a load balancer that allocates the requests among a pool of VMs. QT is preferred for stationary systems with constant arrival and service rates. An auto-scaling problem in a realistic cloud environment is modelled by periodically varying the incoming
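The queue-length and waiting-time metrics mentioned above can be computed in closed form for the classic M/M/c queue (Poisson arrivals, exponential service, c identical servers). The sketch below uses the standard Erlang-C formula; the arrival and service rates are illustrative values, not figures from this paper.

```python
from math import factorial

def mmc_metrics(lam, mu, c):
    """Erlang-C metrics for an M/M/c queue; requires lam < c * mu for stability."""
    a = lam / mu                                      # offered load (Erlangs)
    rho = a / c                                       # per-server utilisation
    assert rho < 1, "queue is unstable"
    # Probability that an arriving request must wait (Erlang-C formula).
    p_wait = (a**c / factorial(c)) / (
        (1 - rho) * sum(a**k / factorial(k) for k in range(c))
        + a**c / factorial(c))
    wq = p_wait / (c * mu - lam)                      # mean waiting time in queue
    lq = lam * wq                                     # mean queue length (Little's law)
    return p_wait, wq, lq

p_wait, wq, lq = mmc_metrics(lam=8.0, mu=3.0, c=4)    # 8 req/s arriving, 3 req/s per VM
```

As the text notes, these formulas assume stationary arrival and service rates, which is exactly the assumption that breaks down under fluctuating cloud workloads.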
Hybrid auto-scaler (predictive–reactive) framework
The goal of the HAS framework is to allocate resources with sufficient capacity for processing the incoming workloads. It addresses two issues pertaining to resource provisioning: estimating the required capacity for every workload, and the time at which it has to be provisioned. Once a resource request arrives, the framework completes the task execution without dropping or rejecting any request. Further, the system is modelled as an open QM where the process enters the system and completes the execution
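A schematic control loop in the spirit of the predictive–reactive design described above is sketched below. The headroom factor, the per-VM capacity, and the scale-out rule are hypothetical simplifications for illustration only: the predictive phase provisions for the forecast arrival rate, and the reactive phase scales out immediately when the observed load already exceeds the provisioned capacity.

```python
import math

def control_step(predicted_rate, observed_rate, capacity_per_vm, headroom=0.1):
    """Return the VM count for the next interval (predictive, then reactive)."""
    # Predictive phase: provision ahead for the forecast rate plus headroom.
    target = math.ceil(predicted_rate * (1 + headroom) / capacity_per_vm)
    # Reactive phase: if the live load already exceeds the provisioned
    # capacity, scale out now rather than waiting for the next forecast.
    if observed_rate > target * capacity_per_vm:
        target = math.ceil(observed_rate / capacity_per_vm)
    return max(target, 1)

print(control_step(predicted_rate=90, observed_rate=150, capacity_per_vm=40))  # → 4
```

The reactive branch is what prevents request drops when the predictor underestimates a burst, at the cost of a short provisioning delay for the extra VMs.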
Experimental evaluation
The present work focuses on achieving maximum resource utilization with reasonably low response time during the resource scaling process. The experimental evaluation is performed in four steps, summarized as follows:
-
The proposed integrated (predictive–reactive) provisioning mechanisms are applied to internet applications with peak workloads, and observations are made.
-
The future arrival rate (predicted trend), required capacity and performance metrics (resource
Conclusion
A Hybrid Auto-Scaler is presented for provisioning virtualized computing resources for modern internet applications without any human intervention. A time series method is employed to predict future arrivals, and novel analytical models are presented for pre-estimating the capacity demands. Subsequently, CTMM is utilized for allocating the resources and balancing the loads. The system is evaluated using various benchmarking applications with fluctuating loads to ensure its efficiency.
Acknowledgment
The support of Anna University Regional Campus, Tirunelveli, for this work in terms of computing facilities is gratefully acknowledged.
Bibal Benifa JV received her B.E. degree in Computer Science and M.E. degree in Software Engineering from Anna University, Chennai in 2009 and 2011, respectively. Currently she is a Ph.D. Research Scholar with the Department of Computer Science and Engineering, Anna University Regional Campus, Tirunelveli. Her research interests include cloud computing, distributed computing and large scale data processing.
References (58)
- et al., Coordinating self-sizing and self-repair managers for multi-tier systems, Future Gener. Comput. Syst. (2014)
- et al., Empirical prediction models for adaptive resource provisioning in the cloud, Future Gener. Comput. Syst. (2012)
- et al., A resource elasticity framework for QoS-aware execution of cloud applications, Future Gener. Comput. Syst. (2014)
- et al., Cloud autoscaling simulation based on queuing network model, Simul. Model. Pract. Theory (2017)
- et al., Elasticity in cloud computing: State of the art and research challenges, IEEE Trans. Serv. Comput. (2018)
- F. Al-Haidari, M. Sqalli, K. Salah, Impact of CPU utilization thresholds and scaling size on autoscaling cloud...
- et al., ElastMan: Elasticity manager for elastic key-value stores in the cloud
- Amazon Elastic Compute Cloud. [Online]. Available: http://aws.amazon.com/ec2/ [Accessed:...
- Apache Cassandra, https://academy.datastax.com/planet-cassandra/nosql-performance-benchmarks [Accessed:...
- et al., An autonomic resource provisioning approach for service-based cloud applications: A hybrid approach, Future Gener. Comput. Syst. (2018)
- Applying reinforcement learning towards automating resource allocation and application scalability in the cloud, Concurr. Comput.: Pract. Exper.
- Reinforcement learning-based proactive auto-scaler for resource provisioning in cloud environment
- An auto-scaling framework for heterogeneous Hadoop systems, Int. J. Cooper. Inf. Syst.
- A review of auto-scaling techniques for elastic applications in cloud environments, J. Grid Comput.
- A survey on elasticity management in PaaS systems, Computing
- Testing elastic computing systems, IEEE Internet Comput.
- Adaptation in cloud resource configuration: a survey, J. Cloud Comput.
Dejey Dharma received her B.E. and M.E. degrees in Computer Science and Engineering from Manonmaniam Sundaranar University, Tirunelveli, India, in 2003 and 2005, respectively. Later, she was with the Department of Computer Science and Engineering, Manonmaniam Sundaranar University, Tirunelveli, India, as a Junior Research Fellow under the UGC Research Grant. She completed her Ph.D. in Computer Science and Engineering in 2011. She has been with the Department of Computer Science and Engineering, Anna University-Regional Campus, Tirunelveli, as an Assistant Professor since 2010 and as the Head of the Department from 2011 to 2015. She is a member of IEEE, IE(INDIA) and ISTE. Her research interests include Large scale data processing, image and signal processing, watermarking, information hiding and multimedia security.