Energy efficient VM scheduling and routing in multi-tenant cloud data center

https://doi.org/10.1016/j.suscom.2019.04.004

Abstract

A cloud data center hosting many composite applications of multiple tenants consumes a massive amount of energy, and developing energy-efficient mechanisms for data center management has become a vital issue for cloud providers. Most works on energy-efficient resource provisioning focus either on reducing the energy consumption of servers or on reducing the energy consumption of network elements, but not both. We formulate the problem of jointly optimizing the energy consumption of servers and network elements through optimal VM scheduling and routing as an integer programming problem. To solve it, a phase-wise optimization approach with two ant-colony-based meta-heuristic algorithms is proposed. The topology features of the data center network and the communication patterns of the applications are considered in the construction of the solution. The solution is tested on three standard data center networks, 3-tier, B-Cube and Hyper-Tree, of different sizes and compared against two standard algorithms, first-fit and round-robin. The results show that the proposed solution improves energy savings by 15% and 20% on average compared with first-fit and round-robin respectively.

Introduction

Cloud computing is a thriving paradigm in the computing industry, and recently there has been a massive surge in the use of cloud services to support many sought-after internet applications in fields such as e-commerce, social networking, on-demand video streaming and big-data analytics. The reasons for this growing trend are that users are relieved of the burden of owning and maintaining server resources, the total cost of ownership is reduced, resources and data can be accessed from anywhere, and the provisioned resources can be flexibly scaled up and down according to dynamically changing needs.

To service this ever-increasing demand for cloud services, cloud providers are deploying large-scale data centers containing thousands of servers and network switches across the world. With the rapid growth in the number and size of data centers, energy consumption and the costs associated with it have increased dramatically. According to the latest study on US data center energy usage, data centers in the US are estimated to consume 73 billion kWh, which is nearly 2% of the total energy consumed in the country [1]. As electricity prices are also rising rapidly, cloud service providers are experiencing substantial energy costs. Apart from the operational costs, exorbitant levels of energy consumption harm the environment through the large volumes of carbon emissions released from these data centers. Hence, reducing energy usage has become a key concern for cloud providers in the design and operation of large-scale data centers.

The energy cost of a data center arises mainly from the energy consumed by physical servers, networking components and cooling equipment. Typically, an idle server in a data center consumes more than 50% of the energy it consumes when fully loaded. Based on this finding, a popular approach to save energy is server consolidation [2], [3], [4], [5], [6]: the process of allocating the virtual machines that service one or many tasks onto a few servers. The remaining servers are switched off entirely or put into a low-power mode to save energy, as the energy-disproportionate servers in data centers consume a considerable amount of power even when they are idle. The network switches and routers in the data center network also contribute a significant portion of the total power consumed by the data center, drawing a sizable amount of power even when idle, almost 30% of the power consumed when fully loaded. As with physical servers, a prudent approach to save energy is to allocate the network flows of the virtual machines to a small number of networking components and turn off the remaining ones, since they carry no load.
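To make the figures above concrete, the sketch below assumes a simple linear power model, P(u) = P_idle + (P_peak − P_idle)·u, for both servers and switches; the wattage constants are illustrative values chosen to match the idle-to-peak ratios mentioned in the text (about 50% for servers, about 30% for switches), not measurements reported in the paper.

```java
// Illustrative linear power model for servers and switches.
// Constants are assumptions matching the ratios in the text, not paper data.
public final class PowerModel {

    // Hypothetical server figures: 300 W at full load, 150 W idle (~50%).
    static final double SERVER_PEAK_W = 300.0;
    static final double SERVER_IDLE_W = 150.0;

    // Hypothetical switch figures: 200 W at full load, 60 W idle (~30%).
    static final double SWITCH_PEAK_W = 200.0;
    static final double SWITCH_IDLE_W = 60.0;

    /** Power drawn by a powered-on server at CPU utilization u in [0, 1]. */
    static double serverPower(double u) {
        return SERVER_IDLE_W + (SERVER_PEAK_W - SERVER_IDLE_W) * u;
    }

    /** Power drawn by a powered-on switch at traffic load u in [0, 1]. */
    static double switchPower(double u) {
        return SWITCH_IDLE_W + (SWITCH_PEAK_W - SWITCH_IDLE_W) * u;
    }

    public static void main(String[] args) {
        // An idle-but-on server still costs 150 W; turning it off saves all of it.
        System.out.printf("Idle server: %.0f W, loaded server: %.0f W%n",
                serverPower(0.0), serverPower(1.0));
        System.out.printf("Idle switch: %.0f W, loaded switch: %.0f W%n",
                switchPower(0.0), switchPower(1.0));
    }
}
```

This disproportionality is exactly what makes consolidation pay off: utilization changes move power only between the idle and peak levels, while switching a machine off removes the idle component entirely.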

At any given time, a cloud data center must host many composite applications belonging to different tenants. These composite applications are made of several sub-tasks that are logically connected by data and flow dependencies. Generally, a virtual machine in a cloud can serve one or more tasks of an application, and each tenant requests several virtual machines to service the tasks belonging to their application. The virtual machines belonging to a tenant communicate and exchange data with each other during their execution. If the behavior of a tenant's application is known in advance, the communication patterns of its virtual machines can be predicted. If two communicating virtual machines are placed far apart in the data center network, the data they exchange passes through many network elements, resulting in higher network energy consumption. The idea here is to consolidate the virtual machines onto a minimal number of servers in such a way that any two communicating virtual machines are placed close to each other in the data center network. The communication data among the virtual machines is then routed through a minimal number of links and switches, so that the remaining idle links and switches can be turned off to save energy.
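As a rough illustration of why locality matters, the sketch below counts how many switches a flow between two VMs crosses in a simple three-tier hierarchy, depending on whether the VMs share a server, an edge switch, a pod, or only the core. The host encoding and method names are assumptions made for this sketch, not the paper's model.

```java
// Illustrative switch-hop count between two VMs in a simple 3-tier tree.
// Hosts are identified by (pod, edge switch); encoding is an assumption.
public final class HopCount {

    /** Switches traversed by a flow between two hosts in a 3-tier tree. */
    static int switchesTraversed(int podA, int edgeA, int podB, int edgeB, boolean sameHost) {
        if (sameHost) return 0;                        // intra-server traffic, no switch used
        if (podA == podB && edgeA == edgeB) return 1;  // same edge switch (same rack)
        if (podA == podB) return 3;                    // edge -> aggregation -> edge
        return 5;                                      // edge -> aggregation -> core -> aggregation -> edge
    }

    public static void main(String[] args) {
        System.out.println(switchesTraversed(0, 0, 0, 0, true));   // 0: co-located VMs
        System.out.println(switchesTraversed(0, 0, 0, 0, false));  // 1: same rack
        System.out.println(switchesTraversed(0, 0, 1, 2, false));  // 5: across the core
    }
}
```

Every extra hop keeps an additional switch active, so placing heavily communicating VMs on the same server or rack directly shrinks the set of network elements that must stay powered on.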

Contributions of the paper

  • We model the joint optimization of server and network element energy consumption during VM scheduling and routing as an integer programming problem.

  • Two meta-heuristic algorithms based on ant colony optimization are proposed as a solution to the problem.

  • The proposed solution is tested for three standard data center network topologies, namely 3-Tier, B-Cube and Hyper-Tree, of different sizes, and its effectiveness is compared with two standard heuristic solutions, first-fit and round-robin.

Section snippets

Related work

In recent times, effective energy management in data centers has become a vital issue for cloud providers. The growing energy costs of data centers have motivated researchers to propose energy-efficient mechanisms for data center management [3], [7], [8], [9], [10].

Sheikh et al. [11] proposed an evolutionary approach for optimizing the combined objective of performance, energy and temperature when scheduling parallel tasks on multi-core servers. The solution here is aimed at reducing

Data center model

A heterogeneous single-site cloud data center, consisting of a set of m heterogeneous servers {H1, H2, …, Hm}, is considered here. The capacity of a server Hi in terms of computing power, RAM size and storage is given by $Z_i^{comp}$, $Z_i^{mem}$ and $Z_i^{storage}$ respectively. The data center network topology considered here is a hierarchical topology that connects the switches in different layers, namely core, aggregate and edge. The set of switches is represented as S = {S1, S2, …}. As the switches are
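A minimal sketch of the entities in this (truncated) model section is given below, assuming simple value classes for server capacities and layered switches; the field names mirror the notation above ($Z_i^{comp}$, $Z_i^{mem}$, $Z_i^{storage}$) but the types themselves are illustrative, not taken from the paper.

```java
import java.util.List;

// Minimal sketch of the data center model: heterogeneous servers with
// compute/memory/storage capacities and switches arranged in layers.
// Type and field names are assumptions for illustration.
public final class DataCenterModel {

    enum SwitchLayer { CORE, AGGREGATE, EDGE }

    /** Server H_i with capacities Z_i^comp, Z_i^mem, Z_i^storage. */
    record Server(String id, double compCapacity, double memCapacity, double storageCapacity) {}

    /** A switch S_k in one of the hierarchical layers. */
    record Switch(String id, SwitchLayer layer) {}

    record DataCenter(List<Server> servers, List<Switch> switches) {}

    public static void main(String[] args) {
        DataCenter dc = new DataCenter(
            List.of(new Server("H1", 16.0, 64.0, 1000.0),
                    new Server("H2", 32.0, 128.0, 2000.0)),
            List.of(new Switch("S1", SwitchLayer.CORE),
                    new Switch("S2", SwitchLayer.AGGREGATE),
                    new Switch("S3", SwitchLayer.EDGE)));
        System.out.println(dc.servers().size() + " servers, " + dc.switches().size() + " switches");
    }
}
```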

Problem formulation

Given a data center network G(S, E) and virtual machine requests from n tenants, with a communication graph $C_j(\hat{V}, \hat{E})$ for each tenant indicating the communication pattern of its requested virtual machines, our aim is to consolidate the VM requests onto a minimal number of servers and switches. The remaining servers and switches are turned off to save energy. So the objective here is to consolidate the virtual machines belonging to multiple tenants onto the physical servers in such a way
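In the spirit of this formulation, a hedged sketch of the integer program is shown below: binary variables indicate where each VM is placed and whether each server or switch remains powered on, and the objective minimizes the total energy of the active elements subject to capacity and placement constraints. The notation and the exact constraint set are assumptions for illustration and may differ from the paper's full model.

```latex
% Sketch of the joint objective (notation assumed, not verbatim from the paper):
% x_{v,i} = 1 if VM v is placed on server H_i,
% y_i    = 1 if server H_i stays powered on,
% z_k    = 1 if switch S_k stays powered on.
\min \; \sum_{i=1}^{m} y_i \, P^{\mathrm{server}}_i \;+\; \sum_{k} z_k \, P^{\mathrm{switch}}_k
\quad \text{s.t.} \quad
\sum_{v} x_{v,i}\, d_v^{\mathrm{comp}} \le Z_i^{\mathrm{comp}}\, y_i \;\; \forall i, \qquad
\sum_{i} x_{v,i} = 1 \;\; \forall v, \qquad
x_{v,i},\, y_i,\, z_k \in \{0,1\}.
```

Analogous capacity constraints on memory, storage and link bandwidth, plus flow-conservation constraints tying the routing variables to the z_k indicators, complete the program; this is what makes the joint problem an integer program rather than two independent ones.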

Energy efficient VM scheduling and routing

The solution to the problem is divided into two stages. The first stage finds an optimal schedule of the virtual machines belonging to different tenants. Once the virtual machines are placed optimally, the second stage routes the communication flows belonging to different tenants through the links in such a way that the total power consumption of the network switches is reduced.
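The two stages can be summarized by the skeleton below. The interface, method names and types are placeholders chosen for this sketch; the ant-colony internals of the paper's algorithms are not reproduced here.

```java
import java.util.List;
import java.util.Map;

// High-level skeleton of the two-stage approach: stage 1 consolidates VMs
// onto servers, stage 2 routes the resulting inter-VM flows over as few
// switches and links as possible so the rest can be powered off.
public interface TwoStageScheduler {

    /** Stage 1: map each requested VM to a server, favoring consolidation
     *  and locality of communicating VMs (e.g. via an ACO search). */
    Map<String, String> scheduleVMs(List<String> vmRequests,
                                    Map<String, List<String>> communicationGraph);

    /** Stage 2: given the placement, choose a route (sequence of switches)
     *  for each inter-VM flow so that unused switches can be turned off. */
    Map<String, List<String>> routeFlows(Map<String, String> placement,
                                         Map<String, List<String>> communicationGraph);

    /** Convenience driver: run both stages and return the switches left on. */
    default List<String> activeSwitches(List<String> vmRequests,
                                        Map<String, List<String>> communicationGraph) {
        Map<String, String> placement = scheduleVMs(vmRequests, communicationGraph);
        return routeFlows(placement, communicationGraph).values().stream()
                .flatMap(List::stream).distinct().toList();
    }
}
```

Splitting the problem this way keeps each search space tractable: the placement stage fixes which servers stay on, and the routing stage then only has to decide which network elements the already-determined flows need.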

Experimental results

This section presents a detailed discussion of the results of the experiments carried out to test the proposed solution. The solution is implemented on a laptop with an Intel Core i5-4200U CPU with four cores and 6 GB RAM. Java is used to implement the algorithms proposed in the solution. The efficiency of the solution is tested on three data centers of different sizes. The results of the above experiments are compared with two standard heuristic algorithms for VM scheduling, first fit and round

Conclusion

In this paper, the communication-aware, energy-efficient VM scheduling and routing problem in a cloud data center is addressed. To save energy, the proposed two-phase solution first consolidates the virtual machines onto a few servers while placing communicating virtual machines in close proximity. It then consolidates the communication flows of the virtual machines onto a few switches and links. The solution is tested on three different standard data center network architectures of three different sizes.

References (21)

  • A. Beloglazov et al.

    Energy-aware resource allocation heuristics for efficient management of data centers for cloud computing

    Future Gener. Comput. Syst.

    (2012)
  • T. Baker et al.

    An energy-aware service composition algorithm for multiple cloud-based IoT applications

    J. Netw. Comput. Appl.

    (2017)
  • Y. Li et al.

    An online power metering model for cloud environment

    2012 IEEE 11th International Symposium on Network Computing and Applications

    (2012)
  • J. Koomey et al.

    United States Data Center Energy Usage Report, Technical Report LBNL-1005775

    (2016)
  • N. Quang-Hung et al.

    EPOBF: energy efficient allocation of virtual machines in high performance computing cloud

    Transactions on Large-Scale Data-and Knowledge-Centered Systems XVI

    (2014)
  • M.A. Oxley et al.

    Makespan and energy robust stochastic static resource allocation of a bag-of-tasks to a heterogeneous computing system

    IEEE Trans. Parallel Distribut. Syst.

    (2015)
  • L. Wang et al.

    GreenDCN: a general framework for achieving energy efficiency in data center networks

    IEEE J. Select. Areas Commun.

    (2014)
  • K. Ye et al.

    Profiling-based workload consolidation and migration in virtualized data centers

    IEEE Trans. Parallel Distribut. Syst.

    (2015)
  • S.U. Khan et al.

    A pure Nash equilibrium-based game theoretical method for data replication across multiple servers

    IEEE Trans. Knowledge Data Eng.

    (2009)
  • T. Baker et al.

    GreeAODV: an energy efficient routing protocol for vehicular ad hoc networks

    International Conference on Intelligent Computing

    (2018)
