ROAR: A QoS-oriented modeling framework for automated cloud resource allocation and optimization

https://doi.org/10.1016/j.jss.2015.08.006

Highlights

  • ROAR transparently converts QoS goals to a set of optimized cloud resources.

  • ROAR implements a fully automatic end-to-end resource optimization process.

  • ROAR supports generic web applications, including those with multi-tier distributed architectures.

  • ROAR supports multiple cloud platforms such as AWS and GCE.

  • ROAR generates a complete resource deployment template for the specific cloud platform.

Abstract

Cloud computing offers a fast, easy, and cost-effective way to configure and allocate computing resources for web applications, such as consoles for smart grid applications, medical records systems, and security management platforms. Although a diverse collection of cloud resources (e.g., servers) is available, choosing the most optimized and cost-effective set of cloud resources for a given web application and set of quality of service (QoS) goals is not a straightforward task. Optimizing cloud resource allocation is a critical task when offering web applications through a software-as-a-service model in the cloud, where minimizing operational cost while ensuring QoS goals are met is essential to satisfying customer demands and maximizing profit. Manual load testing with different sets of cloud resources, followed by comparison of the test results to QoS goals, is tedious and inaccurate due to the limitations of load testing tools, the difficulty of characterizing resource utilization, the significant manual test orchestration effort, and the challenges of identifying resource bottlenecks.

This paper introduces our work on a modeling framework, ROAR (the Resource Optimization, Allocation and Recommendation System), that simplifies, optimizes, and automates cloud resource allocation decisions to meet QoS goals for web applications, including complex multi-tier applications distributed across different server groups. ROAR uses a domain-specific language to describe the configuration of the web application, the APIs to benchmark, and the expected QoS requirements (e.g., throughput and latency). Its resource optimization engine then uses model-based analysis and code generation to automatically deploy and load test the application in multiple resource configurations in order to derive a cost-optimal resource configuration that meets the QoS goals.

Introduction

Cloud computing shifts computing from local dedicated resources to distributed, virtual, elastic, and multi-tenant resources. This paradigm provides end-users with on-demand access to computing, storage, and software services (OGRAPH and MORGENS, 2008). A number of cloud computing providers, such as Amazon Web Services (AWS) (Amazon Web Services, 2015) and Google Compute Engine (GCE) (Google Compute Engine, 2015), offer cloud computing platforms that provide custom applications with high availability and scalability. Users can allocate, execute, and terminate instances (i.e., cloud servers) as needed, and pay only for the time and storage that active instances use, based on a utility cost model (Rappa, 2004).

To satisfy the computing resource needs of a wide variety of application types, cloud providers offer a menu of server types with different configurations of CPU capacity, memory, network capacity, disk I/O performance, and disk storage size. Table 1 shows a subset of the server configurations provided by AWS as of September 2014. For example, the m3.medium server with 3 ECU (EC2 Compute Units, a relative measure of the CPU processing power of an Amazon EC2 cloud server) and 3.75 GB of memory costs $0.07/h, while the more powerful m3.2xlarge server costs $0.56/h. By comparison, the GCE cloud provider offers competitive options, as shown in Table 2. Although the server types are named differently (e.g., n1-standard-1 stands for the standard general-purpose server type with 1 virtual CPU in the n1 series of machines), the resource configurations at each pricing range vary only slightly. Cloud computing users must use this information to determine the subset of these resource configurations that will run an application cost-effectively while still meeting its QoS goals, such as response time.
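As a concrete illustration of the data involved in this decision, the following minimal Python sketch shows one way a catalog of candidate server types could be represented and ordered by hourly cost. The m3.medium figures and both prices come from the text above; the compute-unit and memory values for m3.2xlarge are illustrative placeholders rather than quoted provider data.

```python
from dataclasses import dataclass

@dataclass
class InstanceType:
    name: str
    compute_units: float   # e.g., AWS ECU or GCE virtual CPUs
    memory_gb: float
    price_per_hour: float  # on-demand price in USD

# Prices for m3.medium and m3.2xlarge are taken from the text; the other
# m3.2xlarge fields are illustrative, not quoted from the provider.
CATALOG = [
    InstanceType("m3.medium", 3.0, 3.75, 0.07),
    InstanceType("m3.2xlarge", 26.0, 30.0, 0.56),
]

def cheapest_first(catalog):
    """Order candidate configurations from least to most expensive."""
    return sorted(catalog, key=lambda t: t.price_per_hour)

for t in cheapest_first(CATALOG):
    print(f"{t.name}: {t.compute_units} ECU, {t.memory_gb} GB, ${t.price_per_hour:.2f}/h")
```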

A common use case of the cloud is to offer existing software products, particularly web-based applications, through a software as a service (SaaS) model. In a SaaS model, the SaaS provider runs the web application in the cloud and customers access the software platform remotely, while the provider manages and maintains the software. SaaS providers typically provide service level agreements (SLAs) (Shu and Meina, 2010) to their clients, which dictate the number of users they will support, the availability of the service, the response time, and other parameters. For example, a provider of a SaaS electronic medical records system may offer an SLA that ensures a certain number of employees in a hospital can simultaneously access the system and that response times will stay under 1 s.

An important consideration of SaaS providers is minimizing their operational costs while ensuring that the QoS requirements specified in their SLAs are met. For example, in the medical records system outlined above, the SaaS provider would like to minimize the cloud resources allocated to it to reduce operational cost, while ensuring that the chosen cloud resources can support the number of simultaneous clients and response times agreed to in the client SLAs. Moreover, as new clients are added and QoS requirements become more stringent (particularly in terms of the number of supported clients), the SaaS provider would like to know how adding resources on-demand will affect the application’s performance. Blindly allocating resources to the application to meet increasing load is not cost-effective. Auto-scaling should instead be guided by a firm understanding of how resource allocations impact QoS goals.

Open Problem. While conventional cloud providers support simple and relatively quick resource allocation for applications, it is not easy or straightforward to decide on an optimized and cost-effective resource configuration for a specific application based on its QoS requirements. For instance, if a custom web application is expected to support 1000 simultaneous users with a throughput of 1000 requests/min, it is hard to decide the type and minimum number of servers, or the cloud provider, simply by looking at the hardware configurations. Although this type of challenge existed in traditional computing environments, such as self-hosted data centers and server rooms, it is exacerbated in cloud computing by the growing number of cloud providers to choose from and the virtualization-based nature of the cloud.
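To make the sizing question concrete, the back-of-the-envelope sketch below computes the minimum number of identical servers needed to reach a throughput target, assuming the sustainable per-server throughput has already been measured (the 350 requests/min figure is purely hypothetical). Obtaining that per-server figure reliably is exactly what cannot be read off the hardware configurations, and it is the gap ROAR addresses through automated load testing.

```python
import math

def min_servers(target_rpm: float, measured_rpm_per_server: float) -> int:
    """Smallest number of identical servers whose combined measured
    throughput covers the target, assuming load is spread evenly."""
    return math.ceil(target_rpm / measured_rpm_per_server)

# Hypothetical: if one server of a given type sustains 350 requests/min for
# this workload, the 1000 requests/min goal needs ceil(1000 / 350) = 3 servers.
print(min_servers(1000, 350))
```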

Cloud providers and users have traditionally performed complex experimentation and load testing with applications on a wide variety of resource configurations. The most common practice is to deploy the application and perform a load stress test on each type of resource configuration, followed by analysis of the test results and selection of a resource configuration (Menascé, 2002). A number of load testing tools, such as jMeter (Apache JMeter, 2015; Halili, 2008), ApacheBench (Apache HTTP Server Benchmarking Tool, 2015), and HP LoadRunner (HP LoadRunner, 2015), are available to trigger a large number of test requests automatically and collect the performance data. The situation gets even harder when an application has a complex distributed architecture with multiple tiers, which requires allocating different groups of resources for each tier and then linking and connecting them properly.

Despite the importance of selecting a cost-optimized resource allocation to meet QoS goals, many organizations do not have the time, resources, or experience to derive and perform a myriad of load testing experiments on a wide variety of resource types. Instead, developers typically employ a trial-and-error approach in which they guess at the appropriate resource allocation, load test the application, and then accept it if the performance is at or above the QoS goals. Optimization is usually only performed months or years into the application’s life in the cloud, when the effect of resource allocations on QoS goals is better understood. Even then, the optimization is often not systematic.

The primary challenges that impede early resource allocation optimization stem from limitations with load testing tools and a number of manual procedures required in the cloud resource optimization process. It is often hard to specify the customized load tests and correlate load test configuration with the expected QoS goals. Moreover, it is tedious and error-prone to manually perform load testing with different cloud resources to derive an optimized resource configuration, particularly when the application has a complex multi-tier architecture in a heterogeneous cloud platform.

Even when an optimized resource configuration is finally obtained, allocating and deploying all of the resources requires significant orchestration and adds complexity. Prior research has addressed some of these challenges separately (e.g., modeling realistic user behavior to produce customized load tests (Draheim et al., 2006), or monitoring target test server performance metrics for capacity planning (Custom Plugins for Apache JMeter, 2015)). However, a comprehensive, fully automated approach designed specifically for benchmarking, deriving, and implementing optimized cloud resource allocations has not been developed, particularly for multi-tier applications.

Solution approach: a QoS-oriented modeling framework, the Resource Optimization, Allocation and Recommendation System (ROAR). To address these challenges, this paper presents a model-based system called “ROAR” that raises the level of abstraction of load testing and automates cloud resource optimization and allocation to transparently convert users’ application-specific QoS goals into a set of optimized resources running in the cloud. A textual domain-specific language (DSL) called the Generic Resource Optimization for Web applications Language (GROWL) is defined to specify a high-level, customizable load testing plan and the QoS requirements without low-level configuration details, such as the number of threads to use, the number of concurrent clients, and how long to keep connections open.
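The concrete GROWL syntax is presented later in the paper; purely as a hypothetical illustration of the kind of information such a specification carries, the Python structure below names the endpoints to benchmark and the QoS goals while omitting any low-level load-generator settings. All endpoint paths and values are invented for this example.

```python
# Hypothetical illustration (not GROWL syntax) of the information a ROAR
# test specification captures: which APIs to benchmark and what QoS goals
# to meet, with no thread counts, client counts, or connection settings.
test_spec = {
    "application": "GhostBox",  # the paper's motivating example
    "apis": [
        {"method": "GET",  "path": "/policies"},        # hypothetical endpoint
        {"method": "POST", "path": "/policies/apply"},  # hypothetical endpoint
    ],
    "qos": {
        "throughput_rpm": 1000,   # target requests per minute
        "avg_latency_ms": 1000,   # target average response time
    },
}

print(test_spec["qos"])
```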

The model built from the GROWL DSL can generate a test specification that is compatible with our extended version of the jMeter load testing tool (Apache JMeter, 2015; Halili, 2008). Likewise, ROAR automates the process of deploying the application to the given cloud platform, executing the load test, collecting and aligning performance metrics, analyzing the test performance model, and controlling the next test iteration. ROAR can derive the appropriate cloud resource configurations to test, as well as automatically orchestrate the tests against each resource allocation before using the results to recommend a cost-optimized resource allocation that meets the QoS goals. Once the optimal configuration is decided, ROAR can also generate the resource templates needed to automate the final deployment process.
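The sketch below gives a rough picture of such an iterate-and-recommend loop. The benchmark callable stands in for ROAR’s deploy, load-test, and metrics-collection steps; the metric names and the simple cheapest-first stopping rule are assumptions made for illustration, and the sketch omits ROAR’s bottleneck analysis and multi-tier resource grouping.

```python
from typing import Callable, Dict, Optional, Sequence

def recommend(configs: Sequence[dict],
              benchmark: Callable[[dict], Dict[str, float]],
              qos: Dict[str, float]) -> Optional[dict]:
    """Try candidate configurations from cheapest to most expensive and
    return the first one whose measured metrics satisfy the QoS goals."""
    for config in sorted(configs, key=lambda c: c["price_per_hour"]):
        metrics = benchmark(config)  # stands in for deploy + load test + collect
        if (metrics["throughput_rpm"] >= qos["throughput_rpm"]
                and metrics["avg_latency_ms"] <= qos["avg_latency_ms"]):
            return config
    return None  # no candidate met the goals

# Usage with fabricated measurements, purely for illustration.
configs = [{"name": "m3.medium", "price_per_hour": 0.07},
           {"name": "m3.2xlarge", "price_per_hour": 0.56}]
fake_results = {"m3.medium":  {"throughput_rpm": 600,  "avg_latency_ms": 1400},
                "m3.2xlarge": {"throughput_rpm": 2100, "avg_latency_ms": 300}}
best = recommend(configs, lambda c: fake_results[c["name"]],
                 {"throughput_rpm": 1000, "avg_latency_ms": 1000})
print(best["name"] if best else "no configuration met the goals")
```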

The work presented in this paper extends the initial prototype of ROAR (Sun et al., 2014), with the following new contributions and extensions:

  • Multi-tier distributed web applications that require deployment across different groups of cloud servers are now supported by ROAR. The multi-tier configuration can be specified in the GROWL DSL, and the ROAR controller automatically allocates the right resources to deploy and link the different tiers.

  • The average latency of the deployment target has been added as another key QoS goal, in addition to the target throughput supported in our earlier work. The ROAR resource optimization engine aligns the average latency metrics together with the throughput and resource utilization metrics, and filters out resource configurations that support the target throughput but fail to meet an average latency goal.

  • Rather than focusing on a single cloud platform (such as AWS), the ROAR deployment manager has been extended to support deploying multi-tier applications to multiple cloud providers, which broadens its scope. In particular, we have included GCE as another platform supported by ROAR, which offers a wider range of optimization choices.

  • The motivating example has been changed from a simple single-tier web application to a canonical three-tier web application with a database. Additional experiments and evaluation have also been included to showcase and quantify the performance, cost, and benchmarking effort for the resource configurations generated by ROAR.

The remainder of this paper is organized as follows: Section 2 summarizes a concrete motivating example in the context of a multi-tier web application, followed by describing the key challenges of cloud resource allocation and optimization in Section 3. Section 4 analyzes our solution by explaining each key component in the ROAR framework. Section 5 discusses the validation of this framework by presenting the generated resources for the motivating example. Section 6 compares our work on ROAR with the related research and Section 7 presents concluding remarks.

Section snippets

Motivating example

The example in this paper is based on a multi-tier web service called GhostBox built to support fine-grained control over mobile applications on the Android platform. GhostBox is designed to enable system administrators to configure specific security policies for employee smartphones and manage the policies remotely. These security policies specify the following properties:

  • The types of smartphone usage restrictions, such as running applications, accessing contacts, and changing system settings

Challenges

Determining the most cost-effective set of resources to meet a web application’s QoS requirements without over-provisioning is a common problem when launching services in production or transitioning services to a SaaS model. The conventional practice is to benchmark web application performance using load testing on each set of possible target cloud resources. Developers use the results from these tests to estimate the resources the application needs to meet its QoS goals.

Solution: Resource Optimization, Allocation and Recommendation System (ROAR)

To address the five challenges presented in Section 3 associated with load testing to direct and automate resource allocation and optimization, we developed the Resource Optimization, Allocation and Recommendation System (ROAR). ROAR combines modeling, model analysis, test automation, code generation, and optimization techniques to simplify and optimize the derivation of cloud resource sets that will meet a web application’s QoS goals. This section explains each key component in the ROAR

Evaluation

This section presents a case study evaluation based on the motivating example from Section 2. Based on the given web application stack, we run experiments with two sets of QoS goals in both cloud platforms, as shown in Table 5.

These experiments allow us to observe the different optimizations provided by ROAR, as well as the difference between the various cloud computing platforms.

By default, ROAR considers all the available instance types from each cloud platform. This experiment focuses on the

Related work

This section compares our work on ROAR with the following related research efforts.

Ferry et al. (2013) summarized the state of the art in cloud optimization and pointed out the need for model-driven engineering techniques and methods to aid provisioning, deployment, monitoring, and adaptation of multi-cloud systems. Revel8or (Zhu et al., 2007) is a model-driven capacity planning tool suite for solving problems related to complex multi-tier applications with strict performance requirements. They

Concluding remarks

This paper presents a QoS-oriented modeling framework called the Resource Optimization, Allocation and Recommendation System (ROAR). ROAR automates the testing and derivation of optimized cloud resource allocations for web applications. In particular, ROAR automates end-to-end test orchestration, from application deployment and metrics collection and alignment to test model generation, test iteration, and termination.

ROAR uses a textual DSL called GROWL to hide the low-level


References (47)

  • Amazon EC2 Pricing, 2015.
  • Amazon Web Services - Auto-Scaling, 2015.
  • Amazon Web Services, 2015.
  • Amazon Web Services - Elastic Load Balancing, 2015.
  • Apache HTTP Server Benchmarking Tool, 2015.
  • Apache JMeter, 2015.
  • AWS Cloud Formation, 2015.
  • AWS Elastic Beanstalk adds Docker support, 2015.
  • Bacigalupo, D.A., et al., 2004. An investigation into the application of different performance prediction techniques to e-commerce applications. Proceedings of the IEEE 18th International Parallel and Distributed Processing Symposium.
  • Bettini, L., 2013. Implementing Domain-Specific Languages with Xtext and Xtend.
  • Binz, T., et al., 2014. TOSCA: portable automated deployment and management of cloud applications. Advanced Web Services.
  • Catan, M., et al., 2013. Aeolus: mastering the complexity of cloud application deployment. Service-Oriented and Cloud Computing.
  • Chaisiri, S., et al., 2012. Optimization of resource provisioning cost in cloud computing. IEEE Trans. Serv. Comput.
  • Containers on Google Cloud Platform, 2015.
  • Custom Plugins for Apache JMeter, 2015.
  • Di, S., et al., 2013. Dynamic optimization of multiattribute resource allocation in self-organizing clouds. IEEE Trans. Parallel Distrib. Syst.
  • Di, S., et al., 2013. Error-tolerant resource allocation and payment minimization for cloud system. IEEE Trans. Parallel Distrib. Syst.
  • Docker, 2015.
  • Docker Hub Registry, 2015.
  • Draheim, D., et al., 2006. Realistic load testing of web applications. Proceedings of the IEEE 10th European Conference on Software Maintenance and Reengineering (CSMR).
  • Eysholdt, M., et al., 2010. Xtext: implement your language faster than the quick and dirty way. Proceedings of the ACM International Conference Companion on Object Oriented Programming Systems Languages and Applications Companion.
  • Ferry, N., et al., 2013. Towards model-driven provisioning, deployment, monitoring, and adaptation of multi-cloud systems. Proceedings of the IEEE 6th International Conference on Cloud Computing.
  • Fink, J., 2014. Docker: a software as a service, operating system-level virtualization framework. Code4Lib J.

    Dr. Yu Sun is an Assistant Professor of Computer Science in the Computer Science Department at California State Polytechnic University, Pomona. His research focuses on cloud computing, mobile computing, Internet-Of-Things (IoT) and large-scale distributed systems. Before joining Cal Poly Pomona, he was a post-doc research associate at Vanderbilt University, the Director of Engineering at Cloudpoint Labs, where he led the research and development on the high-precision 3D augmented reality technology for mobile platforms. He also worked in Amazon Web Services as a software development engineer and participated in the development of the first cloud-based mobile web browser project Amazon Silk. His research is conducted through the Software engineering, Cloud and Mobile computing (SoftCoM) Laboratory at Cal Poly Pomona, which he directs. He received his Ph.D. from the University of Alabama at Birmingham in 2011.

    Dr. Jules White is an Assistant Professor of Computer Science in the Department of Electrical Engineering and Computer Science at Vanderbilt University. He was previously a faculty member in Electrical and Computer Engineering at Virginia Tech and won the Outstanding New Assistant Professor Award at Virginia Tech. His research has won 3 Best Paper Awards and 2 Best Student Paper Awards. He has also published over 95 papers. Dr. White’s research focuses on securing, optimizing, and leveraging data from mobile cyber-physical systems. His mobile cyber-physical systems research spans four key focus areas: (1) mobile security and data collection, (2) high-precision mobile augmented reality, (3) mobile device and supporting cloud infrastructure power and configuration optimization, and (4) applications of mobile cyber-physical systems in multi-disciplinary domains, including energy-optimized cloud computing, smart grid systems, healthcare/manufacturing security, next-generation construction technologies, and citizen science. His research has been licensed and transitioned to industry, where it won an Innovation Award at CES 2013, attended by over 150,000 people, was a finalist for the Technical Achievement Award at SXSW Interactive, and placed in the top 3 for mobile in the Accelerator Awards at SXSW 2013. His research is conducted through the Mobile Application computinG, optimizatioN, and secUrity Methods (MAGNUM) Group at Vanderbilt University, which he directs.

    Dr. Douglas C. Schmidt is a Professor of Computer Science at Vanderbilt University. His research interests include patterns, optimization techniques, and empirical analyses of middleware and domain-specific modeling tools that facilitate the development of distributed real-time and embedded systems. He received his Ph.D. in 1994 from the University of California, Irvine.
