1 Introduction

Massively Multiplayer Online Games (MMOGs) have attracted considerable attention from both academia and industry. A recent report [1] states that the number of MMOG players worldwide had grown to 20 million by 2010. An MMOG is a seamless virtual world in which millions of participants around the globe play roles and interact with their surroundings via avatars, i.e., virtual characters. There are several types of MMOGs, including MMORPGs (Massively Multiplayer Online Role-Playing Games, e.g., World of Warcraft), MMOFPSs (Massively Multiplayer Online First Person Shooters, e.g., Firefall), MMORTSs (Massively Multiplayer Online Real Time Strategy games, e.g., Boom Beach), and so on.

Online gaming has traditionally been implemented with a client-server architecture, in which a central game server handles commands from players and returns state updates about the game session to its connected clients. To provide a good quality of experience, MMOG providers used to over-provision game servers in order to satisfy delay constraints. However, such static resource provisioning wastes resources during off-peak gaming hours and increases MMOG providers' total operational costs. To address this problem, MMOG operators have moved from the client/server architecture to cloud computing. In MMOG clouds, games are stored and run on remote cloud servers. After receiving a player request, the MMOG cloud platform allocates a virtual machine (VM) to serve the player's requests.

A main challenge for MMOG service providers is to find the best tradeoff between two contradictory aims: improving the quality of experience (QoE) and reducing the energy consumed by MMOG infrastructures. On the one hand, they need to offer a sufficient number of high-performance servers to deliver a high quality of experience, which leads to player satisfaction and loyalty. On the other hand, they have to reduce energy consumption as much as possible in order to reduce the total cost of ownership and increase the return on investment of cloud infrastructures. For example, leading MMOG companies operate over 10,000 servers and spend over $50 million per year [2].

In this paper, we propose a dynamic resource provisioning scheme that automatically manages the physical resources of an MMOG cloud infrastructure so as to minimize the energy cost of MMOG service providers while achieving just-good-enough QoE for game players. More specifically, the primary contributions of this paper are as follows:

  • We consider the problem of dynamic resource provisioning from the perspective of MMOG operators that own their datacenters, and aim at cutting down their energy cost by combining global resource allocation among physical servers with QoE-driven dynamic resource distribution at the physical-server level.

  • We formulate the problem as a constrained optimization problem and employ a genetic algorithm to solve it. Our optimization model considers multiple types of resources, including CPU, memory and network bandwidth, as well as the power consumption incurred by migrating virtual machines.

  • We conduct extensive experiments using real-world data to evaluate the effectiveness of our resource provisioning policy. Our experimental results show that, compared with other alternatives, our resource provisioning policy can save at least 9.8% of energy consumption, while providing even better QoE for game players.

The rest of this paper is structured as follows. Related work is summarized in Sect. 2. In Sect. 3, the system model is introduced. Our proposed genetic algorithm is presented in Sect. 4. In Sect. 5, the experimental evaluation and results are discussed. Finally, conclusions are drawn and future work is discussed in the last section.

2 Related Work

In recent years, extensive efforts have been devoted to enhancing the performance of cloud gaming systems. Wang et al. [3] presented a load assignment solution for cloud-based distributed interactive applications that minimizes the interaction delay among clients. Choy et al. [4] proposed a hybrid edge-cloud architecture for reducing the processing delay on the server side. Chen et al. [5] developed a heuristic algorithm for the inter-player delay optimization problem, which minimizes the inter-player delay while preserving a good-enough absolute response delay experienced by players. These studies differ from our work because they focus solely on improving the performance of cloud gaming systems and thus cannot provide energy savings.

There is a large body of work on trading off monetary costs against user QoE when hosting cloud gaming in public clouds. Wang et al. [6] proposed two practical online algorithms, one deterministic and one randomized, that dynamically combine the two instance types to serve time-varying demands at minimum cost. Nae et al. [7] proposed a service-level-agreement-aware, cost-effective model for hosting and operating MMOGs based on cloud-computing principles. Basiri et al. [8] addressed the resource provisioning problem for cloud gaming so as to minimize the operational costs of the cloud gaming provider while guaranteeing the QoE for users. In contrast to our approach, these approaches are not designed for the private cloud environment; that is, they acquire resources (virtual machines) from public clouds to serve as rendering and game servers.

Server provisioning for cloud gaming providers that own their datacenters has also been studied recently. Wu et al. [9] presented an online control algorithm to perform intelligent request dispatching and server provisioning; the objective of their work is to reduce the provisioning cost of cloud gaming service providers while still ensuring user QoE requirements. Hong et al. [10] studied VM placement problems for maximizing the total net profit of service providers while maintaining just-good-enough gaming QoE. Lee et al. [11] proposed a zone-based server consolidation solution for MMORPGs, which exploits the unique spatial locality of gamers' interactions to reduce hardware investment and energy consumption while maintaining user-perceived service quality. In contrast, our scheme focuses on CPU, memory and network resources, and combines global resource allocation among physical servers with QoE-driven dynamic resource distribution at the physical-server level in order to guarantee the QoE for game players while minimizing power consumption.

3 Problem Formulation

3.1 System Overview

We consider a large-scale cloud infrastructure that supports MMOGs. A virtualized computing environment is assumed, in which multiple MMOGs are hosted in VMs. MMOG clouds use two techniques to partition a virtual game world for parallelization: zoning and replication. Zoning partitions the game world into adjacent areas that are handled independently by separate VMs. Replication copies a game area onto different VMs when that area contains a large number of avatars interacting with each other. The avatars served by a VM are called active entities, whereas the avatars that are active in the other VMs are called shadow entities. The states of active entities and shadow entities are periodically synchronized across VMs. Multiple VMs can run on the same physical machine (PM).

The architecture of our resource provisioning scheme is shown in Fig. 1(a). The Historical Database module collects historical data about game players and sends the collected data to the Load Predictor. Next, the Load Predictor module uses a neural network to predict, from the historical data, the number of players in the next control period. The Energy Optimizer module is then updated with the predicted number of players hosted by each VM, and a new instance of the optimization problem is constructed. The Reconfiguration module uses a capacity model to estimate the resource needs of the VMs and runs a genetic algorithm to solve the new optimization problem instance, producing a new resource provisioning plan (i.e., how much resource is allocated to each VM, which servers must be active, and which VMs are placed on which server). This optimization process is performed periodically.
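To make the interaction of these modules concrete, the following minimal Python sketch outlines one control iteration. The module interfaces (fetch_recent_player_counts, predict, estimate_vm_demands, optimize_placement, reconfigure) are hypothetical names introduced for illustration only; they do not come from the paper.

```python
def control_period(history_db, predictor, estimate_vm_demands, optimizer, cloud):
    """One iteration of the periodic provisioning loop (illustrative sketch)."""
    # 1. Collect the per-VM player counts observed in recent control periods.
    history = history_db.fetch_recent_player_counts()

    # 2. Predict the number of players on each VM for the next control period.
    predicted = {vm: predictor.predict(series) for vm, series in history.items()}

    # 3. Translate the predicted load into CPU/memory/bandwidth demands (Sect. 3.4).
    demands = {vm: estimate_vm_demands(players) for vm, players in predicted.items()}

    # 4. Solve the power-minimization problem (Sect. 3.6) with the genetic algorithm.
    placement = optimizer.optimize_placement(demands)

    # 5. Apply the new plan: resize and migrate VMs, switch servers on or off.
    cloud.reconfigure(placement)
```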

Fig. 1. The architecture of our resource provisioning scheme and neural network predictor

3.2 Load Prediction

The number of players hosted by a game area (VM) changes over time. Since it is difficult to determine the number of players by analytical means, we employ a neural network model to predict the number of players in the near future. Figure 1(b) illustrates the architecture of our load predictor, which consists of a hidden layer and an output layer. We input the number of players from time slot 1 to time slot 20 and predict the number of players at time slot 21. The hidden layer has 10 neurons. Each neuron performs calculations based on the following equations [12]:

$$ k_{j} = g_{1} \left( {\sum\nolimits_{i = 1}^{20} {x_{i} w_{i,j} + b_{j} } } \right)\quad \forall j \in \left\{ {1,2, \ldots ,10} \right\} $$
(1)
$$ x_{21} = g_{2} \left( {\sum\nolimits_{i = 1}^{10} {k_{i} w^{\prime}_{i} + b^{\prime}} } \right) $$
(2)
$$ g_{1} \left( y \right) = \frac{1}{{1 + e^{ - y} }} $$
(3)
$$ g_{2} \left( y \right) = y $$
(4)

where \( w_{i,j} \) are the connection weights between the input and the hidden layer and \( w^{\prime}_{i} \) are the connection weights between the hidden layer and the output layer, \( g_{1} \left( y \right) \) and \( g_{2} \left( y \right) \) are the activation functions of the hidden and output layers, and \( b_{j} \) and \( b^{\prime} \) are the biases of the hidden layer and the output layer, respectively. In addition, \( x_{i} \) (i = 1, 2, …, 20) are the inputs of the load predictor, \( k_{i} \) are the outputs of the hidden layer, and \( x_{21} \) is the output of the network. The Levenberg-Marquardt algorithm is employed for network training, and the neural network predictor is trained only once, offline.
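As a concrete illustration of Eqs. (1)-(4), the sketch below implements the predictor's forward pass with NumPy. The weights and biases are placeholders that would normally come from Levenberg-Marquardt training, and the array shapes are assumptions based on the description above.

```python
import numpy as np

def predict_next_player_count(x, W, b, w_out, b_out):
    """Forward pass of the load predictor (Eqs. 1-4).

    x     : shape (20,)    -- player counts at time slots 1..20
    W     : shape (20, 10) -- weights w_{i,j} between input and hidden layer
    b     : shape (10,)    -- hidden-layer biases b_j
    w_out : shape (10,)    -- weights w'_i between hidden and output layer
    b_out : float          -- output bias b'
    """
    k = 1.0 / (1.0 + np.exp(-(x @ W + b)))   # Eqs. (1) and (3): sigmoid hidden layer
    return float(k @ w_out + b_out)           # Eqs. (2) and (4): linear output layer

# Example with random (untrained) parameters, purely to show the shapes involved.
rng = np.random.default_rng(0)
x21 = predict_next_player_count(rng.random(20), rng.random((20, 10)),
                                rng.random(10), rng.random(10), 0.1)
```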

3.3 Delay Model

Interaction delay is the most critical user-perceived QoE metric for the MMOG cloud. We define the interaction delay as the lag between the time the client sends a player's command to the VM and the time the corresponding game frame is displayed to the player. Interaction delay mainly consists of network delay, processing delay and playout delay. Network delay is essentially the network round-trip time, which can be measured by tools such as ping. Processing delay is the time required by the VM to receive and process a player's command, and to encode and packetize the corresponding game frame for the client. Playout delay is the time for the client to receive, decode, and display the encoded frame. Because playout delay is usually constant and occurs at the client side, we do not consider it in our model for the sake of brevity. Finally, we define the interaction delay \( ID_{i,j} \) of a game player i connected to a VM j as the sum of the processing delay \( PD_{i,j} \) and the network delay \( ND_{i,j} \):

$$ ID_{i,j} = ND_{i,j} + PD_{i,j} $$
(5)
$$ ND_{i,j} = D_{i,j}^{player - VM} + D_{i,j}^{VM - player} + 2D_{i,j}^{propagation} $$
(6)
$$ D_{i,j}^{propagation} = S_{i,j} /R_{l} $$
(7)
$$ D_{i,j}^{player - VM} = L_{i} /R_{i}^{ul} $$
(8)
$$ D_{i,j}^{VM - player} = L_{j} /R_{i}^{dl} $$
(9)

In these expressions, \( R_{i}^{ul} \) and \( R_{i}^{dl} \) are the uplink and downlink data rates of player i, \( S_{i,j} \) is the distance between player i and VM j, \( R_{l} \) is the speed of light, \( D_{i,j}^{player - VM} \) is the transmission delay of the player's data sent to the VM, and \( D_{i,j}^{VM - player} \) is the transmission delay of the VM's data sent to the player. \( D_{i,j}^{propagation} \) is the propagation delay between player i and VM j. \( L_{i} \) is the length of the data packet transmitted from the player to the VM, and \( L_{j} \) is the length of the data packet transmitted from the VM to the player.

In order to express the processing delay mathematically, each VM is modeled as a G/G/1 queuing system to deal with arbitrary arrival and service time distributions. The queuing discipline is assumed to be first-come-first-served (FCFS). According to queuing theory, the average processing time of requests served at VM j can be approximated by the following equation

$$ PD_{i,j} = \frac{{\lambda_{j} \cdot \left( {\delta_{1}^{2} + \delta_{2}^{2} } \right)}}{{2 \cdot \mu_{j} \cdot \left( {\mu_{j} - \lambda_{j} } \right)}} $$
(10)

where \( \lambda_{j} \) is the request arrival rate at VM j and \( \mu_{j} \) is the average service time. \( \delta_{1}^{2} \) and \( \delta_{2}^{2} \) are the variance of the service time and the variance of the inter-arrival time, respectively.
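The delay model of Eqs. (5)-(10) maps directly onto a few lines of code; the sketch below simply mirrors the formulas, with all inputs supplied by the caller.

```python
SPEED_OF_LIGHT = 3.0e8  # R_l in Eq. (7), in m/s

def network_delay(distance, l_up, l_down, r_ul, r_dl):
    """Eqs. (6)-(9): uplink/downlink transmission delays plus two-way propagation."""
    d_prop = distance / SPEED_OF_LIGHT               # Eq. (7)
    d_player_vm = l_up / r_ul                        # Eq. (8)
    d_vm_player = l_down / r_dl                      # Eq. (9)
    return d_player_vm + d_vm_player + 2.0 * d_prop  # Eq. (6)

def processing_delay(lam, mu, var_service, var_interarrival):
    """Eq. (10): G/G/1 approximation of the average processing delay at a VM."""
    return lam * (var_service + var_interarrival) / (2.0 * mu * (mu - lam))

def interaction_delay(nd, pd):
    """Eq. (5): interaction delay is the sum of network and processing delay."""
    return nd + pd
```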

3.4 VM Capacity Model

We propose in this subsection an analytical model for VM capacity in MMOG clouds. Our model considers three main types of resources used by VMs: CPU, memory, and network bandwidth.

Previous work [13] shows that CPU capacity is approximately proportional to its throughput. Thus, the CPU capacity \( R_{j}^{CPU} \) of VM j can be modeled as:

$$ R_{j}^{CPU} = e \cdot t_{j} $$
(11)

where e is an experimentally determined model parameter and \( t_{j} = 1/\mu_{j} \) represents CPU throughput.

Inspired by the linear capacity model in [14], we define the memory capacity \( R_{j}^{mem} \) of a VM j as follows

$$ R_{j}^{mem} = \sum\nolimits_{i = 1}^{{\left| {v_{j} } \right|}} {AE_{i} \cdot m_{cs} + BE_{j} \cdot m_{es} + m_{game} + m_{world} } $$
(12)

where \( v_{j} \) is the set of VMs serving the same game area as VM j, \( AE_{i} \) is the number of avatars on VM i, \( BE_{j} \) is the number of non-player characters on VM j, \( m_{cs} \) is the amount of memory needed to store the state of one avatar, \( m_{es} \) is the amount of memory needed to store the state of a non-player character, \( m_{game} \) is the amount of memory needed to run the game engine with no game world loaded, and \( m_{world} \) is the amount of memory used for the game world being played.

Each VM has an incoming and outgoing network bandwidth capacity. As in [14], we define the outgoing network bandwidth capacity \( R_{j}^{out} \) as follows

$$ R_{j}^{out} = \frac{{AE_{j} \cdot d_{out} + \left( {\left| {v_{j} } \right| - 1} \right) \cdot \left( {AE_{j} + BE_{j} } \right) \cdot d_{updt} }}{{T_{s} }} $$
(13)

where \( d_{out} \) represents the amount of data sent to a client, \( d_{updt} \) is the amount of data exchanged between VMs for updating a single entity state and \( T_{s} \) is the control time period.

Similarly, the incoming network bandwidth capacity \( R_{j}^{in} \) for a VM j is defined as [14]

$$ R_{j}^{in} = \frac{{AE_{j} \cdot d_{in} + \sum\nolimits_{i = 1,i \ne j}^{{\left| {v_{j} } \right|}} {\left( {AE_{i} + BE_{i} } \right) \cdot d_{updt} } }}{{T_{s} }} $$
(14)

where \( d_{in} \) is the amount of data received from a client.
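The capacity model of Eqs. (12)-(14) can be evaluated as in the sketch below. The per-entity constants (m_cs, m_es, m_game, m_world, d_out, d_in, d_updt) are assumed to have been measured offline, and the sum in Eq. (12) is read as ranging over the AE_i term only.

```python
def memory_capacity(ae_counts, be_j, m_cs, m_es, m_game, m_world):
    """Eq. (12): memory needed by VM j; ae_counts holds AE_i for every VM
    serving the same game area as VM j."""
    return sum(ae_counts) * m_cs + be_j * m_es + m_game + m_world

def outgoing_bandwidth(ae_j, be_j, num_vms_in_area, d_out, d_updt, t_s):
    """Eq. (13): outgoing bandwidth of VM j."""
    return (ae_j * d_out + (num_vms_in_area - 1) * (ae_j + be_j) * d_updt) / t_s

def incoming_bandwidth(j, ae_counts, be_counts, d_in, d_updt, t_s):
    """Eq. (14): incoming bandwidth of VM j; the update term skips VM j itself."""
    updates = sum(ae + be for i, (ae, be) in enumerate(zip(ae_counts, be_counts))
                  if i != j)
    return (ae_counts[j] * d_in + updates * d_updt) / t_s
```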

3.5 Power Model

The processor is one of the largest power consumers in today's servers. Recent studies have shown that server power consumption can be accurately described by a linear relationship between power consumption and CPU utilization [15]. In order to save energy, servers are switched off when they are idle. Because today's servers draw very little power in power-off mode, we neglect the power consumed in this mode. The power consumption \( P_{i} \) of server i can thus be modeled as

$$ P_{i} = \left\{ {\begin{array}{*{20}l} {(P_{i}^{max} - P_{i}^{min} ) \cdot U_{i}^{CPU} + P_{i}^{min} } \hfill & {U_{i}^{CPU} > 0} \hfill \\ 0 \hfill & {otherwise} \hfill \\ \end{array} } \right. $$
(15)

where \( U_{i}^{CPU} \) is the CPU utilization of server i, and \( P_{i}^{min} \) and \( P_{i}^{max} \) are the average power consumption when server i is idle and fully utilized, respectively.

Similar to previous work [16], we also consider the power consumption costs incurred when moving VMs. The power consumption \( MP_{j}^{s,i} \) of migrating a VM j from a server s to a server i can be expressed as follows:

$$ MP_{j}^{s,i} = v_{s,i} \cdot R_{j}^{mem} + z_{s,i} $$
(16)

where \( v_{s,i} \) and \( z_{s,i} \) are experimentally determined model parameters.
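Eqs. (15) and (16) are straightforward to evaluate; the sketch below mirrors them directly, with v and z being the experimentally determined migration parameters mentioned above.

```python
def server_power(u_cpu, p_min, p_max):
    """Eq. (15): linear power model; a switched-off server consumes no power."""
    return (p_max - p_min) * u_cpu + p_min if u_cpu > 0 else 0.0

def migration_power(mem_capacity, v, z):
    """Eq. (16): power cost of migrating a VM with the given memory footprint."""
    return v * mem_capacity + z
```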

3.6 Optimization Model

Suppose that we are given N VMs \( j \in NV \) that are to be placed on M physical servers \( i \in NP \). Let \( CN_{j} \) be the set of players connected to VM j, \( ID^{th} \) be the maximum tolerable interaction delay of players, and \( R_{i}^{CT} ,R_{i}^{mT} ,R_{i}^{oT} \) and \( R_{i}^{iT} \) be the resource thresholds of CPU, memory, outgoing and incoming network bandwidth of server i, respectively. The binary decision variable \( x_{i,j} \) indicates whether VM j is assigned to physical server i. Let \( \overline{{p_{j} }} \) be the index of the server hosting VM j in the previous control period and \( \overline{{x_{i,j} }} \) be the value of the variable \( x_{i,j} \) in the previous control period. With the models defined above, we formulate the power optimization problem as:

$$ min\quad \quad \left[ {\sum\nolimits_{i = 1}^{M} {P_{i} } + \sum\nolimits_{i = 1}^{M} {\sum\nolimits_{j = 1}^{N} {\left[ {MP_{j}^{{\overline{{p_{j} }} ,i}} \cdot \hbox{max} \left( {0, x_{i,j} - \overline{{x_{i,j} }} } \right)} \right]} } } \right] $$
(17)
$$ s.t.\quad \quad \sum\nolimits_{j = 1}^{N} {x_{i,j} \cdot R_{j}^{CPU} \le R_{i}^{CT} } \quad \quad \forall i \in NP $$
(18)
$$ \sum\nolimits_{j = 1}^{N} {x_{i,j} \cdot R_{j}^{mem} \le R_{i}^{mT} } \quad \quad \forall i \in NP $$
(19)
$$ \sum\nolimits_{j = 1}^{N} {x_{i,j} \cdot R_{j}^{out} \le R_{i}^{oT} } \quad \quad \forall i \in NP $$
(20)
$$ \sum\nolimits_{j = 1}^{N} {x_{i,j} \cdot R_{j}^{in} \le R_{i}^{iT} } \quad \quad \forall i \in NP $$
(21)
$$ \sum\nolimits_{i = 1}^{M} {x_{i,j} } = 1\quad \quad \forall j \in NV $$
(22)
$$ ID_{p,j} \le ID^{th} \quad \quad \forall j \in NV, \forall p \in CN_{j} $$
(23)
$$ x_{i,j} \in \left\{ {0,1} \right\}\quad \quad \forall i \in NP,\forall j \in NV $$
(24)

The objective function in Eq. (17) minimizes the power consumed by the MMOG cloud. Equations (18), (19), (20) and (21) impose the CPU, memory, outgoing and incoming network bandwidth constraints on each server, respectively. Equation (22) ensures that each VM is assigned to exactly one server. Equation (23) enforces the QoE requirement. Equation (24) defines the domain of the decision variables.
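For illustration, the objective (17) and the resource and assignment constraints (18)-(22) can be evaluated for a candidate placement as in the following sketch, where x[i][j] is 1 if VM j is placed on server i. The delay constraint (23) is assumed to be checked separately through the delay model of Sect. 3.3.

```python
def is_feasible(x, vm_cpu, vm_mem, vm_out, vm_in, cpu_cap, mem_cap, out_cap, in_cap):
    """Check constraints (18)-(22) for a 0/1 placement matrix x[server][vm]."""
    servers, vms = len(x), len(x[0])
    for i in range(servers):
        if sum(x[i][j] * vm_cpu[j] for j in range(vms)) > cpu_cap[i]:  # (18)
            return False
        if sum(x[i][j] * vm_mem[j] for j in range(vms)) > mem_cap[i]:  # (19)
            return False
        if sum(x[i][j] * vm_out[j] for j in range(vms)) > out_cap[i]:  # (20)
            return False
        if sum(x[i][j] * vm_in[j] for j in range(vms)) > in_cap[i]:    # (21)
            return False
    # (22): every VM must be assigned to exactly one server.
    return all(sum(x[i][j] for i in range(servers)) == 1 for j in range(vms))

def total_power(x, x_prev, server_power_of, migration_power_of):
    """Objective (17): server power plus the cost of newly triggered migrations."""
    servers, vms = len(x), len(x[0])
    power = sum(server_power_of(i, x) for i in range(servers))
    power += sum(migration_power_of(j, i) * max(0, x[i][j] - x_prev[i][j])
                 for i in range(servers) for j in range(vms))
    return power
```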

4 Genetic Algorithm Design and Analysis

Because the optimization problem described above is NP-hard, it is impractical to compute optimal solutions within acceptable time for realistic problem sizes. This section shows how a genetic algorithm can be applied to efficiently search for good solutions in large solution spaces. The proposed genetic algorithm is defined by the following elements: (1) chromosome encoding; (2) crossover; (3) mutation; (4) fitness function; (5) selection strategy. Different implementations of these elements result in distinct genetic algorithms with varying degrees of success.

4.1 Chromosome Encoding

A chromosome in the proposed genetic algorithm is an N-by-M matrix whose columns correspond to the physical machines (PMs) and whose rows correspond to the VMs. The elements of the matrix are the genes. The value of a gene is 0 or 1, indicating whether the corresponding VM is assigned to the corresponding PM. Figure 2(a) shows an example VM placement and its corresponding chromosome.

Fig. 2. Chromosome encoding, crossover and mutation operators for our genetic algorithm

4.2 Crossover

Crossover is a genetic operator that aims to combine the better characteristics of the preferred chromosomes. Various crossover methods have been developed for particular problems to provide effective implementations of genetic algorithms.

The proposed genetic algorithm adopts a single-point crossover operator, illustrated in Fig. 2(c). The two parent chromosomes swap a randomly selected row of genes with each other, thereby producing new chromosomes.

4.3 Mutation

The mutation operator is similar to a hill-climbing method, in which a small change is made to the current solution in order to explore its neighborhood in the search space. The idea behind mutation is to introduce some extra genetic variability into the population. In our work, the mutation operator first randomly selects a gene in the chromosome and inverts its value. The values of the other genes in the same row are then set to 0 or 1 such that the sum of all gene values in the row equals 1. Figure 2(b) shows how the mutation operator works, and a sketch of both operators is given below.
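Under the matrix encoding of Sect. 4.1 (rows = VMs, columns = PMs), the crossover and mutation operators described above can be sketched as follows. The repair step after a 1-to-0 inversion, which reassigns the VM to a random PM, is one possible reading of the description and is an assumption of this sketch.

```python
import random

def crossover(parent_a, parent_b):
    """Single-point crossover: the parents swap one randomly chosen row,
    i.e. the placement of one VM (Fig. 2(c))."""
    child_a = [row[:] for row in parent_a]
    child_b = [row[:] for row in parent_b]
    r = random.randrange(len(child_a))
    child_a[r], child_b[r] = child_b[r], child_a[r]
    return child_a, child_b

def mutate(chrom):
    """Invert one randomly chosen gene, then repair its row so that it sums to 1
    (Fig. 2(b))."""
    vm = random.randrange(len(chrom))
    pm = random.randrange(len(chrom[vm]))
    chrom[vm][pm] = 1 - chrom[vm][pm]
    if chrom[vm][pm] == 0:
        # The VM lost its server: reassign it to a random PM (assumed repair rule).
        pm = random.randrange(len(chrom[vm]))
    chrom[vm] = [1 if k == pm else 0 for k in range(len(chrom[vm]))]
    return chrom
```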

4.4 Fitness Function

The role of the fitness function is to numerically measure the quality of a chromosome. For real-world applications of genetic algorithms, choosing the fitness function is a crucial step. In this paper, the fitness f(x) of an individual x in the population of the proposed genetic algorithm is defined in Eq. (25) below:

$$ f\left( x \right) = \left\{ {\begin{array}{*{20}l} {E_{min} /E\left( x \right)} \hfill & {if\,x\,is\,feasible} \hfill \\ {E_{min} /\left( {E\left( x \right) + \infty } \right)} \hfill & {otherwise} \hfill \\ \end{array} } \right. $$
(25)

where E(x) is the power consumption (objective value) of the candidate solution x and \( E_{min} \) is the power consumption of the best solution in the previous population. The fitness function penalizes infeasible solutions, ensuring that the fitness value of any feasible solution is higher than that of any infeasible solution and that lower power consumption yields a higher fitness value.
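One way to realize Eq. (25) in code, consistent with the penalization described above, is to push the energy of an infeasible solution towards infinity (approximated here by a very large constant). This is an interpretation of the formula, not code from the paper.

```python
def fitness(x, energy_of, e_min, is_feasible, penalty=1e12):
    """Eq. (25): ratio of the previous best energy E_min to the candidate's
    energy, with infeasible candidates driven towards zero fitness."""
    e = energy_of(x)
    return e_min / e if is_feasible(x) else e_min / (e + penalty)
```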

4.5 Selection Strategy

The selection strategy determines which chromosomes in the current population are used to reproduce offspring, in the hope that the offspring will have even higher fitness. We employ rank-based roulette-wheel selection as the selection mechanism. Rank-based selection first sorts the individuals in the population according to their fitness and then computes selection probabilities from their ranks rather than from their raw fitness values. Let NS be the number of solutions in each population of the genetic algorithm. The selection probability P(x) of individual x is then defined as:

$$ P\left( x \right) = \frac{R\left( x \right)}{{\sum\nolimits_{i = 1}^{NS} {R\left( i \right)} }} $$
(26)

In Eq. (26), \( R\left( x \right) \) represents the rank of individual x, computed by linear scaling as follows [17]

$$ R\left( x \right) = 2 - SP + \left[ {2 \times \left( {SP - 1} \right) \times \frac{{\left( {L\left( x \right) - 1} \right)}}{{\left( {NS - 1} \right)}}} \right] $$
(27)

where SP is the selective pressure, limited to the range [1, 2], and L(x) is the position of individual x in the sorted population.
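The rank-based roulette-wheel selection of Eqs. (26)-(27) can be sketched as follows. Individuals are sorted by ascending fitness, so the worst individual gets position L(x) = 1 and scaled rank 2 - SP; this ordering is an assumption consistent with linear ranking in [17].

```python
import random

def rank_selection_probabilities(fitnesses, sp=1.5):
    """Eqs. (26)-(27): linear ranking followed by normalization."""
    ns = len(fitnesses)
    order = sorted(range(ns), key=lambda i: fitnesses[i])     # ascending fitness
    position = {idx: p + 1 for p, idx in enumerate(order)}    # L(x) in Eq. (27)
    ranks = [2 - sp + 2 * (sp - 1) * (position[i] - 1) / (ns - 1) for i in range(ns)]
    total = sum(ranks)
    return [r / total for r in ranks]                         # Eq. (26)

def select_one(population, probabilities):
    """Spin the roulette wheel once and return the chosen chromosome."""
    return random.choices(population, weights=probabilities, k=1)[0]
```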

4.6 The Description of the Proposed Genetic Algorithm

Based on the above definitions, the proposed genetic algorithm can be summarized in the following steps (a sketch of the complete loop follows the list).

  1. Randomly generate an initial population of NS chromosomes.
  2. Select NS/2 pairs of chromosomes from the current population according to the selection strategy.
  3. Apply the crossover operator to each of the selected pairs in Step 2 to generate NS chromosomes with a predefined crossover probability CP.
  4. Apply the mutation operator to each of the generated NS chromosomes with a predefined mutation probability MP.
  5. Randomly remove one chromosome from the current population and add the best chromosome in the previous population to the current one.
  6. If the maximum number MI of iterations is reached, stop the algorithm. Otherwise, return to Step 2.
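Putting the pieces together, the six steps above can be sketched as a single loop. The sketch reuses the crossover, mutate, rank_selection_probabilities and select_one helpers sketched earlier, assumes a fitness function following Eq. (25), and uses the parameter values reported in Sect. 5.1 as defaults; it is illustrative, not the authors' implementation.

```python
import random

def genetic_algorithm(random_chromosome, fitness, ns=50, cp=0.5, mp=0.9, mi=100):
    """Steps 1-6 of the proposed genetic algorithm (illustrative sketch)."""
    population = [random_chromosome() for _ in range(ns)]              # Step 1
    for _ in range(mi):                                                # Step 6
        best = max(population, key=fitness)                            # kept for Step 5
        probs = rank_selection_probabilities([fitness(c) for c in population])
        offspring = []
        for _ in range(ns // 2):                                       # Step 2
            pa = [row[:] for row in select_one(population, probs)]
            pb = [row[:] for row in select_one(population, probs)]
            if random.random() < cp:                                   # Step 3
                pa, pb = crossover(pa, pb)
            offspring.extend([pa, pb])
        for k in range(ns):                                            # Step 4
            if random.random() < mp:
                offspring[k] = mutate(offspring[k])
        offspring[random.randrange(ns)] = best                         # Step 5
        population = offspring
    return max(population, key=fitness)
```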

5 Performance Evaluation

5.1 Experimental Environment

Based on open-source software and commodity hardware, we have implemented an MMOG cloud consisting of 6 physical servers of two different types: 3 IBM Flex System x220 servers and 3 IBM Flex System x440 servers. Their parameters are listed in Table 1. All physical servers are connected through a 10 Gigabit Ethernet network. We adopt XenServer as the virtualization software on the physical servers and use OpenStack to create and manage VMs. There are 62 VMs running on the 6 physical servers in the MMOG cloud, divided into three categories: a proxy server for dispatching requests, a monitor server, and 60 game servers. Xen's credit scheduler, Xen's balloon driver and weight-based proportional sharing [18] are used to dynamically adjust the allocation of CPU, memory and network resources to VMs, respectively. The proxy server and the monitor server are each equipped with a single Intel Xeon E5-4620 2.2 GHz core and 2 GB RAM. The network bandwidth between the proxy server and the players' computers is 100 Mbps, while the network bandwidth between the proxy server and the game servers is 10 Gbps.

Table 1. Specification of two types of server used in our testbed

We employ the Ganglia Monitoring System [19] to monitor and collect information about the MMOG cloud: the Ganglia Monitoring Daemon is installed on each game server to monitor its status, and the Ganglia Meta Daemon is installed on the monitor server to collect the monitoring information. We select the BZFlag MMOFPS [20] and the Stendhal MMORPG [21] as the example applications for evaluating the proposed resource provisioning strategy. The BZFlag MMOFPS is deployed on 35 VMs and the Stendhal MMORPG on 25 VMs. We use the API provided by the KBEngine game engine [22] to implement an account generator, which creates a large number of players for the experimental games, and a proxy player machine, which generates action requests and automatically sends them on behalf of the players to the game servers. In addition, players can move across game maps dynamically. The power consumption of the cloud infrastructure is measured with a WattsUp Pro power meter, which has an accuracy of ±1% of the measured value.

The workloads used in our experiments are generated from web traces of the World of Warcraft (WoW) website [23]. We choose two 284-min traffic patterns from the traces and scale them to the range of player numbers that our experimental setup can handle. Figure 3 shows these scaled workloads; the data are sampled every 2 min. The values of the various parameters of the genetic algorithm have a direct effect on its performance; appropriate values were determined through preliminary computational experiments. The final parameter settings are NS = 50, CP = 0.5, MP = 0.9 and MI = 100.

Fig. 3. Workload traces for two MMOG applications

We choose a control interval \( T_{s} \) of two minutes, giving 142 control periods. Every two minutes, our resource provisioning scheme controls VM resizing and the number of servers in active and power-off modes according to the system workload. When reconfiguring the MMOG cloud, the overhead of physical server boot time must be considered, since it greatly affects performance during boot. In this work, we use the double control periods (DCP) model proposed by Zheng et al. [24] to compensate for this overhead: one control period is responsible for adjusting the number of active physical servers, and the other helps to switch on the additional servers in advance. Finally, in order to reduce the total migration time of all migrated VMs, we perform migrations simultaneously; that is, the MMOG cloud can carry out multiple VM migrations in parallel on different servers as long as those servers are not busy with other VM migrations.

5.2 Prediction Model Validation

In this section, we compare the neural network predictor against the well-known ARMA model. To evaluate the accuracy of the prediction models, we use two metrics, the mean absolute percentage error (MAPE) and the coefficient of determination \( \left( {{\text{R}}^{2} } \right) \), which are calculated as follows:

$$ MAPE = \frac{1}{n}\sum\nolimits_{i = 1}^{n} {\left| {\frac{{\gamma_{i} - \widehat{\gamma }_{i} }}{{\gamma_{i} }}} \right|} $$
(28)
$$ R^{2} = 1 - \frac{{\sum\nolimits_{i = 1}^{n} {\left( {\gamma_{i} - \widehat{\gamma }_{i} } \right)}^{2} }}{{\sum\nolimits_{i = 1}^{n} {\left( {\gamma_{i} - \bar{\gamma }} \right)}^{2} }} $$
(29)

where n is the total number of samples, \( \gamma_{i} \) and \( \widehat{\gamma }_{i} \) are the real value and the model-predicted value, respectively, and \( \bar{\gamma } \) is the sample mean of \( \gamma_{i} \). We run our neural network predictor and the ARMA predictor in MATLAB, using the traffic pattern of the Stendhal MMORPG as the input data for both predictors. The prediction results are shown in Fig. 4 and Table 2. The results show that our neural network predictor is significantly more accurate than the ARMA model, which validates the effectiveness of the proposed predictor.
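For reference, the two accuracy metrics of Eqs. (28)-(29) can be computed as follows.

```python
import numpy as np

def mape(y_true, y_pred):
    """Eq. (28): mean absolute percentage error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean(np.abs((y_true - y_pred) / y_true))

def r_squared(y_true, y_pred):
    """Eq. (29): coefficient of determination."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot
```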

Fig. 4. Workload prediction using ARMA model and neural network model

Table 2. Predictive accuracy of the ARMA predictor and the neural network predictor.

5.3 Evaluating Effectiveness

To evaluate the effectiveness of our resource management scheme, we compare the power consumption and performance of our policy against two other policies: a Linux performance policy and a power- and delay-aware placement policy (QDH). The performance policy is a standard Linux baseline that keeps the CPU at the highest available frequency at all times; it performs no VM consolidation and reproduces the behavior of a typical MMOG cloud. The QDH policy uses a heuristic proposed by Hong et al. [10] to dynamically consolidate VMs in order to minimize power consumption while maintaining just-good-enough gaming QoE. We conduct a set of experiments using the two MMOG applications deployed in 60 VMs. The VM placement for the performance policy is statically configured such that the interaction delay targets can be met under the peak workload. The interaction delay targets for the two MMOG applications are set to 0.1 s and 0.2 s, respectively.

The interaction delay results for the three schemes are shown in Fig. 5(a) and (b). The horizontal straight lines represent the interaction delay targets. As expected, the performance scheme meets the desired QoE targets at all times. Our policy and the QDH policy also keep the interaction delay substantially below the predefined threshold. However, compared to the QDH policy, our approach provides better assurance of interaction delay targets in the face of dynamically changing workloads, because we develop more comprehensive system models that account for three types of resources and heterogeneous server types. The better performance of our policy compared to the QDH policy is also confirmed by the smaller QoE degradation shown in Table 3, where QoE degradation is defined as the ratio of the number of players missing their interaction delay targets to the total number of players served over the experiment.

Fig. 5. The results of effectiveness and scalability evaluation

Table 3. Comparison of three resource provisioning policies

Figure 5(c) and Table 3 show the power consumption and the energy consumption during the 284-min period for the three policies. As can be seen from Table 3, our policy reduces energy consumption by about 54.5% compared to the performance policy. This is because, in the case of low loads, VMs are consolidated using live migration, so fewer physical servers are kept running while the others are turned off, which yields significant energy savings. Compared with the QDH policy, our policy achieves an energy reduction of about 9.8%; this smaller difference is mainly obtained by finding better solutions to the VM placement problem.

In summary, compared to the performance scheme, our policy provides significant energy savings while sacrificing only a little performance. Compared to the QDH policy, our policy provides better performance and saves more energy.

5.4 Evaluating Scalability

The last set of experiments examines whether the proposed resource provisioning scheme scales to large MMOG clouds with thousands of VMs. The scalability of our scheme is determined by the computational complexity of the genetic algorithm for the VM placement problem. Since the consolidation ratio, defined as the average number of VMs hosted per server, is an important factor affecting the time needed to solve the VM placement problem, we consider three consolidation ratios in this experiment: 2:1, 4:1 and 6:1. The execution time of the genetic algorithm was measured on a 3.8 GHz Intel Core i7 machine. Figure 5(d) shows the computation time curves for the three consolidation ratios. The execution time of the genetic algorithm increases as the consolidation ratio decreases, because a lower consolidation ratio means that more physical servers are needed to accommodate the VMs, making the placement problem harder. In addition, the genetic algorithm takes less than 10 min to solve placement problems with up to 6000 VMs. Therefore, our resource provisioning scheme is suitable for large-scale MMOG clouds.

6 Conclusion

In this paper, we formulate the problem of dynamic resource provisioning, from the perspective of cloud gaming providers that own their datacenters, as a constrained optimization problem and employ a genetic algorithm to solve it. Our experimental results show that, compared with other alternatives, our resource provisioning policy provides significant energy savings while achieving just-good-enough QoE for gamers. As a continuation of this work, we plan to exploit the social structure of players in online games to predict the number of players.