Profile-based power-aware workflow scheduling framework for energy-efficient data centers
Introduction
In today’s economy, the demand for everyday Internet usage is skyrocketing. Cloud-computing technology enables software-as-a-service (SaaS) over the Internet using the infrastructure in data centers. The large demand for services is driving the growth of data centers, which in turn raises their energy consumption and operational costs [1]. A recent report [2] shows that data centers account for 1.1% to 1.5% of the total power used worldwide. In the US alone, data centers consumed 91 billion kWh of electricity in 2015, about 1.8% of the total US power consumption [3], [4]. In the same year, data centers worldwide used 416.2 TWh of electricity. The power consumption of information technology (IT) directly contributes to a larger carbon footprint.
Since 2013, various energy-efficiency measures have been considered to reduce the power consumption in data centers. Such measures include the design and manufacturing of energy-efficient servers, optimal placement of servers in data centers, efficient deployment of power distribution units, energy-efficient air management and cooling systems, and effective use of virtualization technologies [3], [5], [6]. Most of these improvements have been adopted by hyper-scale data centers owned by IT giants, such as Facebook, Google, and Amazon. In contrast, small- to medium-scale data centers are usually owned by small enterprises, universities, private-sector businesses, and government organizations. These data centers are typically deployed as private cloud infrastructures and provide services to a limited number of clients. According to [7], only 5% of the power consumption related to data centers worldwide is due to hyper-scale data centers; the remaining 95% relates to small- and medium-scale data centers. The NRDC reported that energy management matters more in small- to medium-scale data centers, where workloads are consistent, than in hyper-scale data centers, where workloads are dynamic. Furthermore, the workflows submitted to these smaller data centers are homogeneous: similar kinds of applications are executed by a small group of clients in a multi-tenant environment.
In small- to medium-scale data centers, server consolidation provides a mechanism for using servers efficiently [8], [9]. As a server consolidation technology, virtualization reduces the underutilization of servers by allowing multiple applications to share a physical server, thus maximizing the efficient use of space and reducing the energy, hardware, operational, and deployment costs. However, designing energy-aware mechanisms for deploying tasks with varying workloads in virtualized environments is challenging. Zheng et al. [10] presented a distributed traffic-flow consolidation algorithm for distributing workloads in the data center. The algorithm consolidates traffic flows onto a small set of links and switches and shuts off the unused resources. They noted that, despite the added complexity of the decentralized approach, it achieves an energy-performance tradeoff similar to that of centralized approaches. Wu et al. [11] proposed a lightweight Virtual Machine (VM) migration algorithm that uses a server utilization threshold to determine workload scheduling in a cluster. The use of a threshold can be controversial because dynamic workloads alter the utilization of servers over time; therefore, a single threshold value may not provide an optimal solution. Wang et al. [12] used integer programming to model the ownership costs of VMs per physical machine (PM). They showed that the complexity of the proposed model has no effect on the performance of the consolidations. Shaw et al. [13] noted that VM consolidation increases the average response time of tasks, negatively affecting the energy-performance tradeoff. They proposed a heuristic for restrictive VM consolidation. Taken together, these works indicate that deploying an energy-efficient solution is complex and unpredictable, and might degrade the performance of the data center.
To address this challenging issue, we take inspiration from the concept of application profiles (AP) presented in [14]. In this work, based on the size of an application workload, a certain number of VMs are provisioned and deployed on the PMs in the data center. The new energy-management framework proposed in this paper utilizes realistic profiles of application workloads to achieve a greener and more energy-efficient data center while considering the utilization of resources and performance constraints. The framework devises a three-layer architecture: (i) Application Profile layer (APL), (ii) Virtual Machine layer (VML), and (iii) Physical Machine layer (PML). At the APL, APs are kept that contain application details along with the workload, estimated runtime, and resource requirements. The VML considers VM setup parameters, such as the number of CPU cores, memory assignment, and storage allocation. It is also responsible for VM placement, deployment, and migration on PMs. The PML considers on/off operations on PMs, temperature considerations, and dynamic voltage and frequency scaling (DVFS).
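The three layers described above can be pictured as three record types. The following is a minimal sketch under stated assumptions, not the paper's implementation: the class names, field names, units, default values, and the `vms_needed` helper are all hypothetical and chosen only to mirror the attributes the text lists for the APL, VML, and PML.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class ApplicationProfile:
    """APL record: application details, workload, runtime, resource needs."""
    name: str
    workload_gb: float       # assumed unit for workload size
    est_runtime_s: float     # estimated runtime
    cpu_cores: int           # total cores the application needs
    memory_gb: float         # total memory the application needs


@dataclass
class VirtualMachine:
    """VML record: VM setup parameters."""
    cpu_cores: int
    memory_gb: float
    storage_gb: float


@dataclass
class PhysicalMachine:
    """PML record: on/off state, temperature, DVFS frequency, hosted VMs."""
    cpu_cores: int
    memory_gb: float
    powered_on: bool = False
    temperature_c: float = 25.0   # illustrative default
    frequency_ghz: float = 2.0    # illustrative DVFS setting
    vms: List[VirtualMachine] = field(default_factory=list)

    def can_host(self, vm: VirtualMachine) -> bool:
        """True if the VM fits in the remaining cores and memory."""
        used_cores = sum(v.cpu_cores for v in self.vms)
        used_mem = sum(v.memory_gb for v in self.vms)
        return (used_cores + vm.cpu_cores <= self.cpu_cores
                and used_mem + vm.memory_gb <= self.memory_gb)

    def place(self, vm: VirtualMachine) -> bool:
        """Place the VM if it fits, powering the machine on if needed."""
        if not self.can_host(vm):
            return False
        self.powered_on = True
        self.vms.append(vm)
        return True


def vms_needed(profile: ApplicationProfile, template: VirtualMachine) -> int:
    """Hypothetical sizing rule: enough template VMs to cover cores and memory."""
    by_cores = math.ceil(profile.cpu_cores / template.cpu_cores)
    by_memory = math.ceil(profile.memory_gb / template.memory_gb)
    return max(by_cores, by_memory)
```

For example, under these assumptions a profile that needs 8 cores and 16 GB of memory maps to four VMs of 2 cores and 4 GB each, and a 4-core, 8 GB physical machine can host two of them.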
The work presented in this paper reviews the current work in VM placement on PMs in data centers. We focus on small- to medium-scale data centers routinely deployed in small organizations and universities. A common characteristic of these data centers is the low variability and high certainty in application workloads, resulting in a near-constant number of VMs. Because the workloads vary infrequently, the policy of hosting a certain number of VMs per PM is rarely updated, and usually no adjustments are made [15]. A system model for the workflow assignment in the data center using a novel scheduler algorithm is presented. The performance of the proposed scheduler is validated through simulation studies. We compare the proposed scheduler with two scheduling algorithms, namely stochastic heterogeneous earliest finish time (HEFT) [16] and robust time cost (RTC) [17]. Results show that the proposed scheduler is 19% and 38% more energy efficient than RTC and HEFT, respectively, for medium- to large-sized workloads.
The contributions of this paper are threefold:
- The concept of power-aware APs is highlighted through a motivational case study. A realistic workload is created using SentiStrength [18] and is processed on a Hadoop cluster using various configurations of VM deployment per PM. Results from the experimental testbed are used to devise a mechanism for defining power-aware APs.
- A power-aware framework is proposed for the efficient placement of application workloads in a virtualized data center. The framework utilizes the APs to compute the cost of executing a workflow in the data center, based on the power consumption requirements. A heuristic-based scheduling algorithm for AP matching is developed based on criteria including CPU, memory, I/O, and power consumption requirements. The runtime complexity of the proposed approach is similar to that of the RTC and HEFT schedulers.
- Extensive simulation studies are carried out to evaluate the proposed framework. The results from the scheduler are compared to the RTC and HEFT schedulers for nine different scenarios. Results show that the proposed algorithm is more efficient in terms of energy utilization.
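To make the idea of AP matching on CPU, memory, I/O, and power criteria more concrete, the sketch below shows one plausible heuristic. It is an illustration under stated assumptions and not the scheduler proposed in the paper: the `Profile` and `Request` types, their fields and units, and the rule "pick the feasible profile with the lowest power draw" are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Profile:
    """Hypothetical application profile with capacities and power draw."""
    name: str
    cpu_cores: int
    memory_gb: float
    io_mbps: float
    power_w: float   # assumed average power drawn under this profile


@dataclass(frozen=True)
class Request:
    """Hypothetical workflow requirement, including a power budget."""
    cpu_cores: int
    memory_gb: float
    io_mbps: float
    power_budget_w: float


def match_profile(request: Request, profiles: List[Profile]) -> Optional[Profile]:
    """Return the lowest-power profile that satisfies every criterion.

    A profile is feasible when it offers at least the requested CPU,
    memory, and I/O, and its power draw stays within the budget.
    Ties on power are broken by the smaller core count.
    """
    feasible = [
        p for p in profiles
        if p.cpu_cores >= request.cpu_cores
        and p.memory_gb >= request.memory_gb
        and p.io_mbps >= request.io_mbps
        and p.power_w <= request.power_budget_w
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda p: (p.power_w, p.cpu_cores))


profiles = [
    Profile("small", 2, 4.0, 100.0, 80.0),
    Profile("medium", 4, 8.0, 200.0, 120.0),
    Profile("large", 8, 16.0, 400.0, 200.0),
]
```

With these illustrative profiles, a request for 3 cores, 6 GB, and 150 Mbps under a 150 W budget matches the "medium" profile, while the same request under a 100 W budget has no feasible match. A single pass over the profile list keeps the matching step linear in the number of profiles, which is a design choice of this sketch rather than a claim about the paper's algorithm.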
The rest of the paper is organized as follows. Section 2 provides the background and related works. Section 3 details a motivational case study for the power efficiency of a data center, building the case for the proposed framework based on APs. Section 4 presents details for the proposed power-aware framework. Section 5 presents detailed experimental evaluations followed by the conclusions and future directions in Section 6.
Related works
This section presents an overview of related works in the area of energy efficient workflow scheduling strategies for data centers.
Motivational case study
The VM placement problem is a thoroughly investigated area in cloud computing. Many algorithms have been proposed and developed to optimize VM placement [9], [12], [17], [22], [33], [38], [39], [40], [41], [42], [43], [45], [48]. A major facet of research in power-aware placement of VMs is reducing the power consumption of PMs, increasing the efficiency of the data center by tuning parameters, such as CPU utilization, memory and I/O utilization, and the
Power-aware workflow scheduling framework
The research problem addressed in this paper is the optimization of a data center's energy utilization. The concept of APs is used to place VMs in the cluster while maintaining a healthy tradeoff between task execution times and power consumption. In this section, we detail an energy-management framework utilizing realistic profiles of workflows with various application workloads to achieve a greener and more energy-efficient data center, while considering the utilization of resources and
Evaluation
This section presents a detailed experimental evaluation of the proposed scheduling algorithm. We compare the proposed algorithm with two scheduling algorithms, namely stochastic HEFT [16] and RTC [17]. We modify the HEFT and RTC algorithms to support dynamic workflows, so that they schedule all tasks in a workflow as soon as the workflow arrives. Since neither algorithm considers profiles, we modify the RTC algorithm to include the price factor of a
Conclusions
A significant research problem in cloud computing is finding a tradeoff between power efficiency and high performance. In this paper, we provide a detailed case study using various workloads to highlight the power inefficiency of workflow scheduling in Hadoop. We exploit the concept of building profiles for applications with certain workloads executing in small- to medium-scale data centers. A profile-based energy-efficient framework is proposed with a novel scheduler that
Acknowledgment
This work is partially supported by the Robotics and Internet of Things Lab in the Research and Innovation Center at Prince Sultan University, Saudi Arabia.
References (56)
- et al. Changes in time use and their effect on energy consumption in the united states. Joule (2018)
- et al. Developing an optimized application hosting framework in clouds. J. Comput. System Sci. (2013)
- et al. A stochastic scheduling algorithm for precedence constrained tasks on grid. Future Gener. Comput. Syst. (2011)
- et al. Analyzing hadoop power consumption and impact on application qos. Future Gener. Comput. Syst. (2016)
- Governing energy consumption in hadoop through cpu frequency scaling: an analysis. Future Gener. Comput. Syst. (2016)
- et al. Energy-credit scheduler: an energy-aware virtual machine scheduler for cloud systems. Future Gener. Comput. Syst. (2014)
- et al. Multi-objective energy-efficient workflow scheduling using list-based heuristics. Future Gener. Comput. Syst. (2014)
- et al. Characterizing and profiling scientific workflows. Future Gener. Comput. Syst. (2013)
- The limits to cloud price reduction. IEEE Cloud Comput. (2017)
- A. Shehabi, et al. United states data center energy usage report, (2016) [online] Available:...
- A belief rule based expert system for datacenter PUE prediction under uncertainty. IEEE Trans. Sustain. Comput.
- Joint cooling and server control in data centers: a cross-layer framework for holistic energy minimization. IEEE Syst. J.
- Server consolidation techniques in virtualized data centers: a survey. IEEE Syst. J.
- DISCO: distributed traffic flow consolidation for power efficient data center network
- An energy efficient VM migration algorithm in data centers
- Mathematical programming for server consolidation in cloud data centers
- Countering the collusion attack with a multidimensional decentralized trust and reputation model. Springer J. Multimed. Tools Appl.
- Robust scheduling of scientific workflows with deadline and budget constraints in clouds
- Sentiment strength detection in short informal text. J. Am. Soc. Inf. Sci. Technol.
- Performance and energy efficiency of big data applications in cloud environments: a hadoop case study. J. Parallel Distrib. Comput.
- Bilateral electricity trade between smart grids and green datacenters: pricing models and performance evaluation. IEEE J. Sel. Areas Comm.
- Oasis: scaling out datacenter sustainably and economically. IEEE Trans. Parallel Distrib. Syst.
- Identification of critical parameters for mapreduce energy efficiency using statistical design of experiments
- Issues in adopting agile development principles for mobile cloud computing applications
Basit Qureshi received his Ph.D. degree in computer science from the University of Bradford in 2011. Prior to that, he received his Master of Science degree in Computer Science from Florida Atlantic University in 2002 and his Bachelor of Science degree in Computer Science from Ohio University, OH, USA, in 2000. His research interests include trust, security, and privacy issues in wireless networks, robotics, and smart cities applications. He is a member of IEEE, IEEE Computer Society, IEEE Communication Society, and ACM.