Hybrid resource provisioning for minimizing data center SLA violations and power consumption

https://doi.org/10.1016/j.suscom.2012.01.005

Abstract

This paper presents a practical and systematic approach to correctly provision server resources in data centers, such that SLA violations and energy consumption are minimized. In particular, we describe a hybrid method for server provisioning. Our method first applies a novel discretization technique on historical workload traces to identify long-term workload demand patterns that establish a “base” load. It then employs two techniques to dynamically allocate capacity: predictive provisioning handles the predicted base load at coarse time scales (e.g., hours) and reactive provisioning handles any excess workload at finer time scales (e.g., minutes). The combination of predictive and reactive provisioning achieves a significant improvement in meeting SLAs, conserving energy and reducing provisioning cost.

We implement and evaluate our approach using traces from four production systems. The results show that our approach can provide up to 35% savings in power consumption and reduce SLA violations by as much as 21% compared to existing techniques, while avoiding frequent power cycling of servers.

Introduction

Data centers are very expensive to operate due to the power and cooling requirements of IT equipment such as servers, storage and network switches. As demand for IT services increases, the energy required to operate data centers also increases. The EPA estimated that energy consumption in data centers exceeded 100 billion kWh in 2011, at a cost of $7.4 billion [10]. Rising energy costs, regulatory requirements and social concerns over greenhouse gas emissions have made reducing power consumption critical to data center operators. However, energy efficiency is for naught if the data center cannot deliver IT services according to predefined SLA or QoS goals, since SLA violations result in lost business revenue. For example, Amazon found that every additional 100 ms of latency costs them a 1% loss in sales, and Google observed that an extra 500 ms in search page generation time reduced traffic by 20% [18]. Hence, another challenge data center operators face is provisioning IT resources such that SLA violations are minimized, to prevent loss of business revenue. Today, SLA violations are often avoided by over-provisioning IT resources. This results in excessive energy consumption, and may also increase expenditure through capital overhead, maintenance costs, etc. Thus, an important question in data center resource management is how to correctly provision IT equipment such that SLA violations are avoided and energy consumption is minimized.

The correct provisioning of resources is a difficult task due to variations in workload demands. Most data center workload demands are bursty in nature and often vary significantly over the course of a single day, which makes it difficult to provision resources appropriately. A one-size-fits-all allocation (static provisioning) will result in either over-provisioning or under-provisioning.

The solution we propose in this paper is based on three important observations. First, many workloads in data centers (e.g., Web servers) exhibit periodic patterns (i.e., daily, weekly and seasonal cycles). If we can identify these patterns in the workload, we can then adjust the resource allocation accordingly, and hence improve the accuracy of resource provisioning and reduce power consumption. Second, demand patterns are statistical in nature, and there will be deviations from historical patterns due to unforeseen factors such as flash crowds, service outages, and holidays. Though the volume of such fluctuations is small compared to the total demand, ignoring them completely can result in significant SLA violations. Third, provisioning is not free; there are various associated costs and risks. Frequent provisioning incurs both performance and energy penalties. For example, turning servers on can take a significant amount of time (up to several minutes) and consume a lot of power (close to peak power consumption) [12]. Frequent power cycling of servers causes “wear and tear”, which could result in server failure and service outage(s) [8].

Based on the above observations, we propose a novel resource provisioning approach in this paper. Our main contributions are:

  1. We present a detailed workload characterization study for three real applications and provide insights into their properties: variability and periodicity.

  2. We propose a novel analysis technique, "workload discretization", to determine the "base" workload demand for a service. In particular, we propose a dynamic programming algorithm that can accurately capture the demand while minimizing the costs and risks from a provisioning perspective.

  3. We develop a hybrid approach to provision IT resources: a predictive provisioning approach handles the base workload at coarse time scales (e.g., hours) and a reactive provisioning approach handles any excess demand at finer time scales (e.g., minutes). A coordinated management of these two approaches achieves a significant improvement in energy efficiency without sacrificing performance.

  4. We implement our server provisioning system and evaluate it using empirical workload traces. The results reveal that our workload discretization algorithm better estimates the long-term workload demand from a resource provisioning perspective, and that our hybrid provisioning system is superior to existing approaches in terms of meeting the system's SLA requirements while conserving power, reducing provisioning cost and avoiding frequent power cycling of servers.

The rest of the paper is organized as follows. Section 2 examines the demand distribution and properties of several workload traces from production systems. Section 3 describes our workload discretization technique and Section 4 presents our hybrid provisioning approach. Section 5 discusses our experimental evaluation. Section 6 reviews related work and finally Section 7 concludes the paper with a summary of our work and a description of future directions.

Section snippets

Workload characterization

To understand real data center workload demands, we begin with a detailed workload characterization of three types of applications.

Workload discretization

In Section 2, we showed that IT workloads often exhibit daily periodic patterns with a small percentage of noise (e.g., frequent and sporadic spikes). This section presents novel techniques to identify and discretize such patterns in historical workload traces.

Our algorithms partition the workload demand into disjoint time intervals, such that within each time interval, the demand varies only slightly. A single representative demand value (e.g., mean or 90th percentile demand) is then chosen
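The interval-partitioning idea described above can be sketched with a small dynamic program. This is a simplified illustration only: the paper's algorithm also accounts for provisioning costs and risks, which are omitted here, and the trace, segment count and squared-error objective are assumptions for the example.

```python
import numpy as np

def discretize(demand, num_intervals, rep=np.mean):
    """Partition a demand trace into contiguous segments via dynamic
    programming, minimizing total squared error against each segment's
    representative value. Returns (start, end, level) per segment."""
    x = np.asarray(demand, dtype=float)
    n = len(x)
    # cost[i, j]: squared error of representing x[i:j+1] by its rep value
    cost = np.full((n, n), np.inf)
    for i in range(n):
        for j in range(i, n):
            seg = x[i:j + 1]
            cost[i, j] = np.sum((seg - rep(seg)) ** 2)
    # dp[k, j]: min cost of covering x[:j+1] with k+1 segments
    dp = np.full((num_intervals, n), np.inf)
    back = np.zeros((num_intervals, n), dtype=int)
    dp[0] = cost[0]
    for k in range(1, num_intervals):
        for j in range(k, n):
            best = np.inf
            for i in range(k, j + 1):
                c = dp[k - 1, i - 1] + cost[i, j]
                if c < best:
                    best, back[k, j] = c, i
            dp[k, j] = best
    # Recover segment boundaries by walking the backpointers
    bounds, j = [], n - 1
    for k in range(num_intervals - 1, 0, -1):
        i = back[k, j]
        bounds.append(i)
        j = i - 1
    bounds = [0] + bounds[::-1] + [n]
    return [(bounds[t], bounds[t + 1], rep(x[bounds[t]:bounds[t + 1]]))
            for t in range(num_intervals)]

# Made-up trace with three demand plateaus
segs = discretize([10, 11, 9, 50, 52, 48, 20, 21, 19], 3)
for start, end, level in segs:
    print(start, end, round(float(level), 1))
```

Passing `rep=lambda s: np.percentile(s, 90)` instead of the mean yields a more conservative representative level for each interval, in the spirit of the 90th-percentile option mentioned above.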

Hybrid provisioning

Sections 2 (Workload characterization) and 3 (Workload discretization) show that although workload demands are bursty, there are predictable patterns in them. Resource provisioning based on these patterns can improve accuracy significantly. However, there can still be some deviations from these patterns due to the bursty nature of data center workload demands. In particular, excess workload demand or a sudden spike in demand can cause performance problems. Also, there is a cost and risk associated
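A minimal sketch of how the two tiers might coordinate over one coarse interval follows. The function name, interface, per-server capacity `cap` and `headroom` factor are all illustrative assumptions, not the paper's implementation:

```python
import math

def hybrid_allocation(base_forecast, actual_minutes, cap, headroom=1.0):
    """Sketch of predictive + reactive provisioning for one coarse interval.
    base_forecast: the hour's predicted base demand (requests/s, say).
    actual_minutes: observed per-minute demand within that hour.
    cap: serving capacity of one server. Returns (base servers, per-minute plan)."""
    base = math.ceil(base_forecast * headroom / cap)   # predictive tier, set hourly
    plan = []
    for d in actual_minutes:
        excess = max(0.0, d - base * cap)              # demand beyond the base
        reactive = math.ceil(excess / cap)             # reactive top-up, per minute
        plan.append(base + reactive)
    return base, plan

base, plan = hybrid_allocation(900, [880, 950, 1200, 910], cap=100)
print(base, plan)  # 9 base servers; reactive servers added only for spikes
```

The predictive tier sizes the base allocation once per coarse interval, while the reactive tier adds servers only when observed per-minute demand exceeds base capacity, so power cycling is limited to the excess portion of the workload.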

Evaluation

This section evaluates our workload discretization and hybrid provisioning approaches via analytical trace analysis in Section 5.1 and via experimentation on a real test bed in Section 5.2.

We first present trace-based analysis results for different application traces and show that our hybrid server provisioning approach is superior to existing provisioning approaches. We then present our experimental results based on a real test bed implementation, which resemble our analysis results, and thus

Workload analysis and characterization

Numerous studies have examined workload demand traces. Characterization studies of interactive workloads such as Web or media servers indicate that demands are highly variable, although daily and weekly patterns are common [2], [7]. Rolia et al. find similar patterns in business application workloads [24].

Much research has been conducted on predicting future workload demands. Vilalta et al. use classical time series analysis to distinguish between short-term and long-term patterns [28]. Dinda

Conclusion and future work

It is a challenging task to correctly provision IT resources in data centers to meet SLA requirements while minimizing power consumption. In this paper, we presented novel workload analysis and server provisioning techniques that enable data centers to meet their SLA requirements while conserving power and avoiding frequent power cycling of servers. We analyzed workload traces from production systems and developed a novel workload characterization technique to accurately capture predictable

Anshul Gandhi is a PhD student in the Computer Science Department at Carnegie Mellon University, under the direction of Mor Harchol-Balter. He received his BTech in Computer Science and Engineering from the Indian Institute of Technology, Kanpur. His research interests are in designing and implementing power management policies for datacenters as well as general performance modeling of computer systems.

References (32)

  • A. Gandhi et al., Server farms with setup costs, Performance Evaluation (2010).
  • J. Rolia et al., Statistical service assurances for applications in utility grid environments, Performance Evaluation (2004).
  • K. Appleby et al., Océano-SLA based management of a computing utility.
  • M. Arlitt et al., Web server workload characterization: the search for invariants.
  • M. Arlitt et al., Workload characterization of the 1998 World Cup web site, IEEE Network (2000).
  • M. Bennani et al., Resource allocation for autonomic data centers using analytic performance models.
  • D. Brillinger, Time Series: Data Analysis and Theory (1981).
  • M. Castellanos et al., iBOM: a platform for intelligent business operation management.
  • L. Cherkasova et al., Characterizing locality, evolution, and life span of accesses in enterprise media server workloads.
  • A. Coskun et al., Evaluating the impact of job scheduling and power management on processor lifetime for chip multiprocessors.
  • P. Dinda et al., Host load prediction using linear models, Cluster Computing (2000).
  • U.S. Environmental Protection Agency, Report to Congress on server and data center energy efficiency, Public Law,...
  • A. Gandhi et al., Optimal power allocation in server farms.
  • D. Gmach et al., Capacity management and demand prediction for next generation data centers.
  • D. Gmach et al., Adaptive quality of service management for enterprise services, ACM Transactions on the Web (2008).
  • J. Hellerstein et al., A statistical approach to predictive detection, Computer Networks (2001).

Yuan Chen is a senior research scientist in the Sustainable Ecosystem Research group (SERg) at HP Labs in Palo Alto, CA. Yuan's research area is energy efficient computing, with a focus on performance and power modeling of services and applications in data centers, and control and optimization of data center workload and resource management. His most recent work is developing integrated management of the IT, cooling, and power supply subsystems to improve the energy efficiency and reduce the cost and environmental footprint of data centers. Yuan's past work includes automated IT monitoring and management, content-based publish-subscribe systems, high performance computing, and constraint programming. Yuan has published over 40 technical papers in peer-reviewed journals and conference proceedings, including the Best Paper Award of the International Green Computing Conference (IGCC 2011) and the Best Paper Award of the IEEE/IFIP Network Operations and Management Symposium (NOMS 2008). He has one patent granted and over 20 patent applications pending. Yuan received a BS from the University of Science and Technology of China, an MS from the Chinese Academy of Sciences, and a PhD from the Georgia Institute of Technology, all in Computer Science.

Dr. Daniel Gmach is a researcher at HP Labs Palo Alto. Before that, he was a PhD student in the database group at the Technische Universität München, where he graduated in 2009. He studied computer science at the University of Passau. His current research interests are in adaptive resource pool management of virtualized enterprise data centers, performance measurement and monitoring, hosting large-scale enterprise applications, database systems, and software engineering principles.

Martin Arlitt is a senior research scientist at Hewlett-Packard Laboratories (HP Labs) in Palo Alto, CA, where he has worked since 1997. His general research interests are workload characterization and performance evaluation of distributed computer systems. His 70+ research papers have been cited more than 5,000 times (according to Google Scholar). He has 15 granted patents and many more pending. He is an ACM Distinguished Scientist and a senior member of the IEEE. He has served on the program committee of numerous top-tier conferences, such as ACM SIGMETRICS, the premier venue for research publications on performance evaluation. He is co-program chair of the ACM SIGMETRICS/Performance conference for 2012. He is the creator of the ACM SIGMETRICS GreenMetrics workshop. He is co-chair of the IEEE Computer Society's Special Technical Community on Sustainable Computing. Since 2001, Martin has lived in Calgary, AB, where he is also a senior research associate in the Department of Computer Science at the University of Calgary. In 2008, he was named a member of Calgary's Top 40 under 40.

Manish Marwah is a researcher in the Sustainable Ecosystem Research group (SERg) at HP Labs. He is conducting cross-disciplinary research on the application of data mining techniques to enhance the sustainability of cyber-physical infrastructure, such as data centers and buildings. He received his PhD in Computer Science from the University of Colorado at Boulder. His PhD thesis focused on building distributed systems with enhanced server fault-tolerance to provide a seamless user experience. He has also worked for several years in the telecommunications industry, most recently as an architect with Avaya Labs. His research has led to more than 40 refereed papers in conferences and journals, and has resulted in 8 issued patents (with over 35 pending). Manish received an MS in Computer Science and an MS in Mechanical Engineering from the University of Colorado at Boulder and a BTech from the Indian Institute of Technology, Delhi.
