A high-level energy consumption model for heterogeneous data centers

https://doi.org/10.1016/j.simpat.2013.05.006

Abstract

Data centers consume between 1.7% and 2.2% of the United States’ power. A handful of studies have focused on predicting the power consumption of computing platforms from performance event counters. Most existing power-consumption models retrieve performance counters from hardware, which offers accurate measurement of energy dissipation. Although these models have been verified on several machines with specific CPU chips, they are difficult to deploy in data centers equipped with heterogeneous computing platforms. Models based on resource utilization obtained through OS monitoring tools can be used in heterogeneous data centers, but most of these models are linear. In this paper, we analyze the accuracy of linear models using results from SPECpower, a widely adopted benchmark for evaluating the power and performance characteristics of servers. As of October 2012, 392 results had been published; these servers represent most servers found in heterogeneous data centers. We use R-squared, RMSE (Root Mean Square Error), and average error to validate the accuracy of the linear model. The results show that not all servers fit the linear model well: 6.5% of R-squared values are less than 0.95, which means linear regression does not fit these data well, and 12.5% of RMSE values are greater than 20, which means there is still a large difference between modeled and real power consumption. We extend the linear model to higher-degree polynomial models and find that the cubic polynomial model gives better results than the linear model. We also apply the linear model and the cubic model to estimate real-time energy consumption on two different servers. The results show that the linear model predicts accurately when a server's power consumption swings within a small range, whereas the cubic model gives better results for servers with both small and wide ranges.

Introduction

The power requirements of today's data centers range from 75 W/ft² to 150–200 W/ft² and will increase to 200–300 W/ft² in the near future. Energy cost has become a major part of data center operational cost. To reduce the operational cost of large-scale data centers, researchers have developed a wide range of energy-saving and thermal management techniques (e.g., workload consolidation, live migration, and CPU throttling).

Workload consolidation is one of the most effective ways of conserving power, because it allows spare servers to be turned off. In many cases, workload consolidation is combined with virtual machines, which are migrated from many physical machines onto a smaller number of physical machines. This approach saves energy by reducing the number of active servers. Migration policies depend on both the performance requirements and the power consumption of different workloads. Power consumption models fall into two categories. Models in the first category predict the power consumption of each physical machine; models in the second category aim to determine the impact of each virtual machine’s load on energy consumption. Previous studies show a strong correlation between performance events and power consumption. Existing models rely on various performance events (e.g., CPU, IO, memory, and cache) to estimate the energy consumption of the sub-components of a computing system; sub-components can be classified as either CPU-bound (e.g., CPU and cache) or IO-bound (e.g., DRAM, HDD).

The existing power consumption models are practical for specific machines under certain conditions. These models are reasonably accurate for data centers equipped with homogeneous computing platforms. However, the heterogeneous and dynamic characteristics of modern data centers make these models less accurate and less trustworthy. To apply the traditional models in a heterogeneous data center, system administrators must verify the models on each type of server. For data centers containing a large number of heterogeneous computing components, it is unproductive and time-consuming to adopt and validate a single model to predict the energy consumption of a wide variety of servers.

Early energy consumption models use CPU utilization as the only parameter [1]. Some studies monitor several CPU-related performance counters to estimate power consumption [2]. Other studies use multiple performance counters covering CPU, memory, disks, and network to calculate energy consumption [3], [4]; they extend the model with more parameters to estimate power consumption more accurately. All of these models are linear, and most were validated on only a few servers. Fan et al. validated the linear model in Google data centers with thousands of servers, but they did not report how many kinds of servers those data centers contained.

To address the aforementioned problem, we validate the accuracy of the linear model by comparing modeling results against available data obtained from the SPECpower_ssj2008 benchmark (or SPECpower for short), which is a widely adopted benchmark used to evaluate the power and performance characteristics of servers. As of October 2012, 392 servers had been evaluated by SPECpower. These servers came from 26 different vendors over the past 6 years, so they can represent the heterogeneous servers on the market. To our knowledge, this is the first power usage study covering so many different kinds of servers. Some of our key findings and contributions are:

  • A wide validation on heterogeneous servers. We validated the linear energy model against 392 published results measured on different kinds of servers. We analyzed the accuracy through R-squared, RMSE, and average error for the different kinds of servers.

  • We found that not all servers fit the linear model well. 6.5% (25 kinds of servers) of R-squared values are less than 0.95, which means that CPU utilization is not strongly linearly correlated with power consumption on these servers.

  • The average RMSE over all servers is 14.86; 118 kinds of servers have RMSE values greater than 10, and 49 kinds have values greater than 20. For these servers, even when the R-squared value suggests that the linear model fits well, there is still a large difference between modeled and real power consumption.

  • We also illustrate how power consumption differs among servers with the same CPU. We find that different servers have different power consumption characteristics even with the same CPU and similar workloads.

  • We found that many servers whose power swings over a wide range (maximum power consumption minus idle power consumption) do not fit the linear model well. We fit these data with higher-degree polynomial models and find that a cubic polynomial gives better results.

  • We apply the linear model and the cubic model to two different servers to validate our conclusion. One server has a small power range and the other has a large one. We illustrate several estimation results under different applications and benchmarks. The results show that the server with the large range has the larger error.
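The three accuracy measures named in the findings above can be computed as in the following minimal sketch. It is an illustration under stated assumptions, not the authors' code: the eleven utilization/power pairs are hypothetical (they are not SPECpower results), RMSE is computed with a divisor of n, and "average error" is taken to mean the mean absolute relative error, a definition this excerpt does not confirm.

```python
import math

def fit_linear(u, p):
    """Ordinary least squares for p ≈ a + b*u (closed form)."""
    n = len(u)
    mean_u = sum(u) / n
    mean_p = sum(p) / n
    sxx = sum((x - mean_u) ** 2 for x in u)
    sxy = sum((x - mean_u) * (y - mean_p) for x, y in zip(u, p))
    b = sxy / sxx
    a = mean_p - b * mean_u
    return a, b

def fit_metrics(actual, predicted):
    """Return (R-squared, RMSE, average relative error)."""
    n = len(actual)
    mean_a = sum(actual) / n
    ss_res = sum((y - f) ** 2 for y, f in zip(actual, predicted))
    ss_tot = sum((y - mean_a) ** 2 for y in actual)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)  # assumption: divisor n, result in watts
    # Assumption: "average error" = mean absolute relative error.
    avg_err = sum(abs(y - f) / y for y, f in zip(actual, predicted)) / n
    return r2, rmse, avg_err

# Hypothetical measurements: utilization 0%..100% in 10% steps, power in watts.
util = [i / 10 for i in range(11)]
power = [60, 95, 120, 140, 155, 168, 180, 192, 205, 222, 245]

a, b = fit_linear(util, power)
predicted = [a + b * x for x in util]
r2, rmse, avg_err = fit_metrics(power, predicted)
print(f"P(u) = {a:.1f} + {b:.1f}*u  R2={r2:.3f}  RMSE={rmse:.2f} W  avg err={avg_err:.1%}")
```

On this made-up server the R-squared value exceeds 0.95 while the RMSE is still several watts, which is the kind of case the third finding describes: a high R-squared alone does not guarantee a small absolute error.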

The rest of this paper is organized as follows. Section 2 introduces the existing energy consumption models and the SPECpower benchmark. Section 3 presents our energy consumption model, which uses high-level performance counters. Section 4 validates the model against the SPECpower results. Section 5 validates the accuracy of the models in real time. The last section discusses future work.

Section snippets

Energy consumption model

Power management is becoming an important issue in data centers. Managers have to reduce the energy costs of servers and cooling systems in order to offer competitive services. It is straightforward to measure the energy consumption of an entire data center [5]. To schedule jobs or consolidate workloads in an energy-efficient way, however, one has to estimate the energy consumed by each computer node in the data center.

Power meters measure system power usage accurately and in real time.

High level energy consumption model

We start this section with an overview of the energy consumption model. Then, we describe how to build the model using the measurement results of SPECpower. We also demonstrate how to apply the proposed model in heterogeneous data centers.
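As a rough illustration of building such a model from benchmark-style measurements and applying it across a mixed fleet, the sketch below fits one cubic polynomial per server type by least squares and sums the predicted power of the fleet. Everything specific in it is an assumption made for illustration: the server names, the wattages, and the helper names `polyfit`, `predict`, and `estimate_fleet_power` are hypothetical, and the normal-equation solver stands in for whatever fitting procedure the paper actually uses.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns [c0, c1, ...] for c0 + c1*x + ..."""
    m = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    A = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        s = sum(A[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - s) / A[i][i]
    return coeffs

def predict(coeffs, u):
    """Evaluate the polynomial at utilization u (0.0 to 1.0) with Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * u + c
    return result

def estimate_fleet_power(models, utilization):
    """Sum predicted power (watts) over servers, each with its own model."""
    return sum(predict(models[name], u) for name, u in utilization.items())

# Hypothetical per-type measurements at 0%, 10%, ..., 100% load (watts).
levels = [i / 10 for i in range(11)]
measured = {
    "server_small_range": [180, 186, 191, 196, 200, 204, 208, 212, 216, 221, 226],
    "server_wide_range": [60, 95, 120, 140, 155, 168, 180, 192, 205, 222, 245],
}
models = {name: polyfit(levels, watts, 3) for name, watts in measured.items()}

# Current CPU utilization, as an OS monitoring tool might report it.
now = {"server_small_range": 0.35, "server_wide_range": 0.72}
print(f"Estimated fleet power: {estimate_fleet_power(models, now):.1f} W")
```

Keeping one coefficient vector per server type means the only run-time input is CPU utilization, which is what makes a utilization-based model portable across heterogeneous hardware.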

Comparison results of same CPU

Previous studies show that CPU load is a major contributor to energy consumption. Since a CPU comes with matched North Bridge chips, we assume that servers with the same CPU have similar energy consumption. We choose seven different servers equipped with the same type of CPU (i.e., Intel Xeon E5-2670 2.60 GHz, 2 chips, 32 threads); all of the tested servers were single-node. Table 1 lists the configuration of the seven servers and their power consumption. Our preliminary results show that not all the

Real-time energy consumption

Unlike the SPECpower results, real-time power estimation is not limited to a few load points. The power consumption model developed in Section 3.2 can be derived from the SPECpower results. In this section, we use the model to estimate the energy consumption of servers in real time. We validate the model by estimating the real-time energy consumption while the SPECpower test is running. The SPECpower benchmark only generates CPU and memory utilization; it does not generate much IO and
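One way to turn a utilization-based power model into a running energy estimate is to evaluate the model at each utilization sample and integrate power over time. The sketch below does this with the trapezoidal rule. It is only an illustration: the coefficients and samples are hypothetical, and the sampling scheme and integration rule are assumptions rather than the procedure used in the paper.

```python
def power_at(coeffs, u):
    """Modeled power in watts at CPU utilization u, clamped to [0, 1].

    coeffs = [c0, c1, c2, c3] describes P(u) = c0 + c1*u + c2*u^2 + c3*u^3;
    a linear model is simply [c0, c1].
    """
    u = min(max(u, 0.0), 1.0)
    result = 0.0
    for c in reversed(coeffs):  # Horner's rule
        result = result * u + c
    return result

def estimate_energy_joules(samples, coeffs):
    """Integrate modeled power over (timestamp_seconds, utilization) samples.

    Uses the trapezoidal rule between consecutive samples, so samples do not
    need to be evenly spaced but must be sorted by time.
    """
    energy = 0.0
    for (t0, u0), (t1, u1) in zip(samples, samples[1:]):
        p0 = power_at(coeffs, u0)
        p1 = power_at(coeffs, u1)
        energy += 0.5 * (p0 + p1) * (t1 - t0)
    return energy

# Hypothetical cubic coefficients for one server (watts).
cubic = [62.0, 340.0, -390.0, 233.0]

# Hypothetical utilization trace: (seconds since start, CPU utilization).
trace = [(0, 0.10), (5, 0.40), (10, 0.85), (15, 0.90), (20, 0.30)]

joules = estimate_energy_joules(trace, cubic)
print(f"Estimated energy over {trace[-1][0]} s: {joules:.0f} J ({joules / 3600:.3f} Wh)")
```

Because the estimate depends only on timestamps and CPU utilization, the same loop works for any server whose coefficients are known, whether they come from a linear or a cubic fit.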

Future work

Compared with models that rely on hardware performance event counters, our model has two compelling advantages. First, our modeling approach can be easily applied to estimate the energy consumption of any type of server. Second, the model can be readily incorporated in a data center equipped with a large number of heterogeneous servers.

Our model has been validated against the 392 measured results obtained from the previous studies. For most of

Conclusion

In this study, we proposed an energy consumption model based on CPU utilization rather than on specific CPU chips. This work was motivated by the fact that the existing models are inadequate for future data centers equipped with heterogeneous servers. Our modeling approach can be easily adopted to estimate the energy consumption of a wide range of heterogeneous servers, because our model is driven by CPU utilization retrieved from operating systems. We take a different approach to find the relationship

Acknowledgments

This work was made possible thanks to the NPU Fundamental Research Foundation under Grant No. JC20110227, the National Science and Technology Ministry under Grant No. 2011BAH04B05, the National High-tech R&D Program of China (863) under Grant No. 2013AA01A215, and the National Natural Science Foundation of China under Grant No. 61033007. This work was also supported by the China Scholarship Council. Xiao Qin’s research was supported by the U.S. National Science Foundation under Grants CCF-0845257 (CAREER),

References (25)

  • C. Isci et al.

    Runtime power monitoring in high-end processors: methodology and empirical data

  • W.L. Bircher, L.K. John, Complete system power estimation: a trickle-down approach based on performance events, in:...
  • D. Economou, S. Rivoire, C. Kozyrakis, P. Ranganathan, Full-system power analysis and modeling for server environments,...
  • T. Heath et al.

    Energy conservation in heterogeneous server clusters

  • X. Fan et al.

    Power provisioning for a warehouse-sized computer

    ACM SIGARCH Computer Architecture News

    (2007)
  • A Software Predict Energy Consumption....
  • Thermal Design Power List of Intel Xeon Processors....
  • W. Bircher et al.

    Complete system power estimation: a trickle-down approach based on performance events

  • W. Bircher et al.

    Complete system power estimation using processor performance events

    IEEE Transactions on Computers

    (2012)
  • K. Singh et al.

    Real time power estimation and thread scheduling via performance counters

    ACM SIGARCH Computer Architecture News

    (2009)
  • G. Contreras, M. Martonosi, Power prediction for intel XScale® processors using performance monitoring unit events, in:...
  • D. Snowdon et al.

    A platform for OS-level power management
