Hybrid ant genetic algorithm for efficient task scheduling in cloud data centers

https://doi.org/10.1016/j.compeleceng.2021.107419

Abstract

Cloud computing is a computing paradigm that meets the computational and storage demands of end users. Cloud-based data centers need to continually improve their performance because of the exponential increase in service demands. Efficient task scheduling is an essential part of cloud computing for achieving maximum throughput, minimum response time, reduced energy consumption, and optimal utilization of resources. Bio-inspired algorithms can solve task scheduling problems effectively, but they need a lot of computational power and time because of the high workload and complexity of the cloud environment. In this research work, a Hybrid ant genetic algorithm for task scheduling is proposed. The proposed algorithm adopts features of the genetic algorithm and the ant colony algorithm and divides tasks and virtual machines into smaller groups. After tasks are allocated, pheromone is added to the virtual machines. The proposed algorithm effectively reduces the solution space by dividing tasks into groups and by detecting loaded virtual machines. Because of this reduced solution space, convergence time and response time decrease significantly. The algorithm finds a feasible scheduling solution that minimizes the running time of workflows and tasks. It achieved a 64% decrease in execution time and an 11% decrease in overall data center costs.

Introduction

Cloud computing is a rapidly emerging field that aims to provide computing resources on demand. Its tremendous growth is due to its scalable, dynamic, and customizable environment. Moreover, it provides virtual resource abstraction and hides the technical aspects of management. Currently, cloud computing has three administrative models that provide software, infrastructure, and platform services [23], [24], [25]. However, researchers have also proposed other models, such as the cloud and sensor integration platform (SC-iPaaS), health platform as a service (HPas), and the social cloud. These models operate on a pay-per-use policy; thus, they do not require business enterprises to invest in their own network infrastructure. Instead, organizations get their computing and storage resources from public cloud providers. Some organizations also have private clouds, but they can opt for public clouds if they need additional resources. The cloud provider is responsible for providing resources on demand. A cloud can deal with abrupt changes in workload demand by offering an auto-scaling feature. There are two types of scaling in data centers: vertical scaling and horizontal scaling. A cloud data center may contain thousands of machines. Energy consumption is a critical concern because of the unprecedented growth of cloud data centers. High energy consumption increases the operational costs of data centers and the emission of carbon dioxide (CO2) into the environment. Resource optimization is required to reduce energy usage and minimize the operational cost of data centers.

Task scheduling in data centers is a critical research area. The load balancing, scalability, and performance of data centers depend on an efficient task scheduling mechanism. Task scheduling aims to map the best resources to the workload in order to minimize the running time of workflows and improve resource utilization. In cloud data centers, many applications execute in parallel. Parallelism cannot be fully achieved if jobs are not properly scheduled. Inefficient scheduling results in high execution time, high cost, and low resource utilization, reducing the overall performance of the cloud. Efficient scheduling algorithms are used to achieve better resource utilization. Scheduling algorithms map tasks to virtual machines in a fashion that reduces the cost and execution time of workflows. Cloud providers' services operate according to the pay-per-use policy. The quality of service (QoS) promised to users is specified in the service level agreement (SLA). Every task has different computation, storage, bandwidth, and response time requirements. If a task cannot obtain its required resources, an SLA violation results. If SLA violations are high in a data center, then the cloud provider's QoS is compromised.
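To make the objectives above concrete, the following minimal sketch evaluates one task-to-VM mapping by its makespan (the time at which the last VM finishes) and its mean VM utilization. The function name `evaluate_mapping`, the use of task length divided by VM speed as the execution time, and the utilization definition are illustrative assumptions, not the paper's objective function.

```python
from typing import Dict, List


def evaluate_mapping(task_lengths: List[float],
                     vm_speeds: List[float],
                     mapping: List[int]) -> Dict[str, float]:
    """Return makespan and mean VM utilization for a task-to-VM mapping.

    task_lengths[i] is the size of task i (for example, million instructions),
    vm_speeds[j] is the speed of VM j (for example, MIPS), and mapping[i] is
    the index of the VM that runs task i. These names are hypothetical.
    """
    busy = [0.0] * len(vm_speeds)
    for task, vm in enumerate(mapping):
        # Assumed execution-time model: task length divided by VM speed.
        busy[vm] += task_lengths[task] / vm_speeds[vm]
    makespan = max(busy)
    # Utilization: fraction of the makespan that VMs spend busy, on average.
    utilization = sum(busy) / (len(busy) * makespan) if makespan > 0 else 0.0
    return {"makespan": makespan, "utilization": utilization}


balanced = evaluate_mapping([400, 200, 200], [100, 100], [0, 1, 1])
skewed = evaluate_mapping([400, 200, 200], [100, 100], [0, 0, 0])
```

Under these assumptions, the balanced mapping keeps both VMs busy for the whole schedule, while the skewed mapping doubles the makespan and leaves one VM idle.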

The task scheduling problem is NP-hard. No algorithm is known that solves NP-hard problems in polynomial time, so researchers have proposed many heuristics for task scheduling. Scheduling algorithms are of two types: heuristic-based and guided random search based. Heuristic-based algorithms achieve good results in task scheduling problems and can be executed on both homogeneous and heterogeneous systems. These can be further divided into several types: list-based heuristics, rule-based heuristics, and meta-heuristics. Rule-based heuristic algorithms are traditional algorithms, for example Max-Min [1], Min-Min [2], and Minimum Completion Time (MCT) [3]. Max-Min algorithms first allocate the tasks with maximum execution time to the processor, followed by smaller tasks. Min-Min algorithms give first preference to the execution of smaller tasks. MCT algorithms allocate each task to the processor that can complete it earliest. Rule-based heuristic algorithms are simple and perform well under light workloads, but their performance degrades as the workload grows. Well-known list-based heuristic algorithms are Heterogeneous earliest finish time (HEFT) [4], Critical path on processor (CPOP) [5], and Parental prioritization-based task scheduling in heterogeneous systems (PPEFT) [6]. In these algorithms, tasks are prioritized and then allocated to processors for execution. Dynamic voltage and frequency scaling (DVFS) is also used in task scheduling to decrease energy consumption. DVFS relies on the fact that power consumption increases with both frequency and voltage. DVFS-based algorithms lower the frequency and supply voltage of servers when tasks are not time critical. List-based heuristic algorithms perform better than rule-based heuristic algorithms when the workload is high.
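The difference between Min-Min and Max-Min can be sketched in a few lines. This is a minimal illustration that assumes the common formulation in which each task's earliest completion time over all VMs is computed first, and the scheduler then picks the task with the smallest (Min-Min) or largest (Max-Min) such time. The function name `batch_schedule`, the `pick_largest` flag, and the length-over-speed execution model are hypothetical choices for this example, not taken from the cited papers.

```python
from typing import List, Tuple


def batch_schedule(task_lengths: List[float],
                   vm_speeds: List[float],
                   pick_largest: bool) -> Tuple[List[int], float]:
    """Schedule independent tasks with Min-Min (False) or Max-Min (True).

    Returns the task-to-VM mapping and the resulting makespan.
    """
    ready = [0.0] * len(vm_speeds)        # time at which each VM becomes free
    mapping = [-1] * len(task_lengths)
    unscheduled = set(range(len(task_lengths)))
    chooser = max if pick_largest else min
    while unscheduled:
        # Earliest completion time and the VM achieving it, per task.
        best = {
            t: min((ready[v] + task_lengths[t] / vm_speeds[v], v)
                   for v in range(len(vm_speeds)))
            for t in unscheduled
        }
        # Min-Min takes the smallest of these times, Max-Min the largest;
        # the task index breaks ties deterministically.
        task = chooser(unscheduled, key=lambda t: (best[t][0], t))
        finish, vm = best[task]
        mapping[task] = vm
        ready[vm] = finish
        unscheduled.remove(task)
    return mapping, max(ready)


min_min_mapping, min_min_makespan = batch_schedule([10, 20, 60], [1.0, 1.0], False)
max_min_mapping, max_min_makespan = batch_schedule([10, 20, 60], [1.0, 1.0], True)
```

In this small instance, Max-Min places the long task first and finishes earlier than Min-Min, which leaves the long task until last. The outcome depends on the workload mix and is not a general ranking of the two heuristics.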

Meta-heuristic algorithms, also called swarm intelligence algorithms, include bio-inspired algorithms such as Ant colony optimization (ACO), the Artificial bee colony algorithm (ABC), the Genetic algorithm (GA), and Particle swarm optimization (PSO). In nature, each individual has only a limited capacity to solve a problem. For example, bees, ants, and birds can search for food with very little intelligence, but they feature self-organization and parallelism. All individuals work without central control and can collectively find an optimized solution. For this reason, bio-inspired algorithms can solve NP-hard problems effectively and find optimized solutions to complex problems. GA works on the principle of inheritance, where features pass from parents to children. Over generations, populations with good features survive. In GA [7], we begin with a random population, which then passes its properties to new offspring. Individuals with better solutions are selected and the others are removed. Over time, the algorithm converges toward the global optimum. ACO [8] is modeled on the behavior of ants. Ants leave their nests in search of food and deposit pheromone on their path. More pheromone accumulates on the shortest path because ants return along that path most quickly. Other ants start to follow the path that has more pheromone. In this way, the colony converges toward a global optimum after some time. ABC [9] simulates the foraging behavior of honeybees. Honeybees are divided into two groups: scout bees that find an optimal solution from food sources, and onlooker bees that find the new food source. PSO [10] is an optimization algorithm inspired by flocks of birds that searches for a global optimum.
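The pheromone mechanism described above can be sketched as follows. This is a minimal illustration of the classic ACO rules (probabilistic selection weighted by pheromone and a heuristic value, followed by evaporation and deposit), applied to VMs rather than paths. The function names and the parameters `alpha`, `beta`, `rho`, and `quality` are assumptions made for this example; this is not the evaporation technique introduced in the paper.

```python
import random
from typing import List, Optional


def choose_vm(pheromone: List[float],
              heuristic: List[float],
              alpha: float = 1.0,
              beta: float = 2.0,
              rng: Optional[random.Random] = None) -> int:
    """Pick a VM with probability proportional to pheromone^alpha * heuristic^beta.

    At least one weight must be positive, otherwise random.choices raises
    ValueError.
    """
    rng = rng or random.Random()
    weights = [(p ** alpha) * (h ** beta) for p, h in zip(pheromone, heuristic)]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def update_pheromone(pheromone: List[float],
                     used_vms: List[int],
                     quality: float,
                     rho: float = 0.1) -> List[float]:
    """Evaporate every trail by a factor rho, then deposit `quality` on used VMs."""
    updated = [(1.0 - rho) * p for p in pheromone]
    for vm in used_vms:
        updated[vm] += quality
    return updated


trails = update_pheromone([1.0, 1.0, 1.0], used_vms=[0], quality=0.5, rho=0.5)
picked = choose_vm([1.0, 0.0, 0.0], [1.0, 1.0, 1.0], rng=random.Random(0))
```

Evaporation lowers every trail so that old choices fade, while the deposit reinforces the VMs that were part of a good solution. Repeating the two steps biases later selections toward those VMs.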

With the increase in the use of mobile devices, social media, and low-cost connectivity to the Internet, cloud usage has increased. Scientists use cloud computing to solve scientific problems because of its high computational power and storage capacity [20], [21], [22]. Many companies have billions of users; for example, in the third quarter of 2020, Facebook recorded 2.7 billion active users. This means that 80 million people used the platform on a daily basis. To handle parallel requests from such a huge number of clients, it is necessary to schedule tasks to appropriate servers and VMs. Inappropriate scheduling can increase the cost of using cloud services, workload execution time, SLA violations, and energy consumption, while decreasing the overall quality of cloud services. These problems are caused by inefficient utilization of resources. According to one study of 5000 servers over six months, utilization was only 10 to 50%. As discussed earlier, the cloud computing model works according to a pay-per-use policy, and the above-mentioned problems can result in high costs for both users and cloud providers. Motivated by these issues, a Hybrid ant genetic algorithm (HAGA) for efficient task scheduling in cloud data centers is proposed. The contributions of this paper are as follows:

  • (1) Proposed the HAGA algorithm for efficient task scheduling in cloud data centers.

  • (2) Designed a mutation technique.

  • (3) Introduced a new evaporation technique.

  • (4) Performed load balancing in the data center.

  • (5) Provided large-scale comparisons of different algorithms.

  • (6) Showed that HAGA has lower computational time, response time, and convergence time than the other algorithms used in the comparison, namely Min-Min, Max-Min, the Adaptive incremental genetic algorithm (AIGA) [11], and GA.

The proposed work was compared with a wide range of algorithms, including Max-Min, Min-Min, GA, and AIGA. The proposed HAGA algorithm outperforms other algorithms in the literature for different performance metrics like execution time, response time, fitness value, convergence time, and SLA violations.

The rest of the paper is organized as follows. In Section 2, relevant literature is reviewed and existing techniques are compared. In Section 3, the objective function is defined. Section 4 describes the proposed HAGA algorithm in detail. Section 5 evaluates the performance of the proposed algorithm, while Section 6 concludes the work.

Related work

In the literature, many researchers have proposed different scheduling algorithms to achieve the maximum performance of the cloud. Jang et al. [12] presented a model in which a scheduling function is called by the task scheduler after every fixed scheduling interval. The proposed scheduler function evaluates the resources required by the tasks and compares them with the available resources to satisfy the user's SLA. The function iterates and generates the optimal schedule. Duan et al. [11] proposed an

Task scheduling problem

Cloud data centers consist of thousands of physical servers on which virtual machines are running. User tasks run on virtual machines. A virtual machine is chosen on the basis of demands. The cloud environment has two types of scheduling. One is the selection of servers to run virtual machines on them. This type of scheduling directly affects the efficiency of data centers, energy consumption and utilization of resources. In the second type of scheduling, virtual machines are selected to run

Proposed approach

The genetic algorithm is an evolutionary algorithm and works on the principles of nature. In nature, individuals that can adapt to a changing environment survive, while others do not. Features of individuals are encoded in genes that are stored in chromosomes. Individuals with good environmental adaptability survive more easily than individuals with less adaptability. In the genetic algorithm, young individuals are produced by the crossover and

Results and discussion

The proposed algorithm, HAGA, is simulated and the results are compared for performance evaluation. It is compared with two well-known scheduling algorithms, Max-Min [1] and Min-Min [2], and two well-known meta-heuristic algorithms, the Standard genetic algorithm (GA) [7] and the Adaptive incremental genetic algorithm (AIGA) [11]. The CloudSim Plus [19] simulator is used to simulate the algorithms. CloudSim Plus is a simulation tool built on the well-known CloudSim simulator, designed for large

Conclusion and future work

In a cloud computing environment, running time of tasks and workflows affects the performance of the data center. An increase in task execution time results in high user costs and operational costs. Evolutionary algorithms can efficiently solve NP-hard problems. These algorithms have a large solution space and take a lot of time and computational power during genetic operations. Their convergence time increases, and the response time of algorithms also increases exponentially. Proposed

CRediT authorship contribution statement

Muhammad Sohaib Ajmal: Conceptualization, Methodology, Data curation, Visualization, Software, Validation, Writing – original draft. Zeshan Iqbal: Methodology, Data curation, Software, Visualization, Validation. Farrukh Zeeshan Khan: Conceptualization, Methodology, Data curation, Visualization. Muneer Ahmad: Data curation, Visualization, Software, Validation, Supervision, Writing – original draft. Iftikhar Ahmad: Investigation, Methodology, Visualization, Validation. Brij B. Gupta:

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Muhammad Sohaib Ajmal is a student of MS Computer Science at the University of Engineering and Technology Taxila, Pakistan. He is currently working as a network design engineer. His research interests focus on Optimization Algorithms, Networks, Virtualization, Cloud Computing, and Software Defined Networks.

References (25)

  • X. Li et al. Artificial bee colony algorithm with memory. Appl Soft Comput J (2016)
  • H. Wang et al. Visual saliency guided complex image retrieval. Pattern Recognit Lett (2020)
  • Y. Mao et al. Max-min task scheduling algorithm for load balance in cloud computing. Proc. Int. Conf. Comput. Sci. Inf. Technol. (2014)
  • X. He et al. QoS guided min-min heuristic for grid task scheduling. J Comput Sci Technol (2003)
  • N.A. Mehdi et al. Minimum completion time for power-aware scheduling in cloud computing
  • H. Topcuoglu et al. Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Trans Parallel Distrib Syst (2002)
  • A. Mazrekaj et al. The Experiential Heterogeneous Earliest Finish Time Algorithm for Task Scheduling in Clouds. CLOSER (2019)
  • M.S. Arif et al. Parental prioritization-based task scheduling in heterogeneous systems. Arab J Sci Eng (2019)
  • Z. Chenhong et al. Independent tasks scheduling based on genetic algorithm in cloud computing
  • G. Li et al. Ant colony optimization task scheduling algorithm for SWIM based on load balancing. Futur Internet (2019)
  • M. Agarwal et al. A PSO algorithm based task scheduling in cloud computing. Int J Appl Metaheuristic Comput (2019)
  • K. Duan et al. Adaptive incremental genetic algorithm for task scheduling in cloud environments. Symmetry (2018)

    Zeshan Iqbal received his M.Sc. and Ph.D. Degree in Computer Engineering from University of Engineering and Technology (UET), Taxila, Pakistan, in 2006 and 2014, respectively. Currently he is working as Assistant Professor at Department of Computer Science, UET Taxila, Pakistan. His research interests focus on Distributed Systems, Computer and Network Virtualization, Machine Learning and Software Defined Networks in Cloud.

    This paper is for special section VSI-bioc. Reviews processed and recommended for publication by Guest Editor Dr. Xiaochun Cheng.
