Distributed Grey Wolf Optimizer for scheduling of workflow applications in cloud environments
Introduction
A workflow application is an application that comprises dependent tasks (i.e., an application with data or control flow dependencies [1], [2]). Cloud computing refers to a network of remote servers on the Internet that provides services such as storage management and data processing [3], [4], [5]. It provides Virtual Machines (VMs) or compute resources to execute workflow applications (e.g., bioinformatics, astronomy and physics applications [6]). A key factor in the success of data processing services in cloud environments is the scheduling technique that maps workflow applications onto the cloud environment such that the cost of using compute resources is minimized. However, the task scheduling problem in cloud computing environments is an NP-hard problem [7]. Therefore, several researchers have attempted in recent years to solve the scheduling problem using optimization-based scheduling algorithms (e.g., Particle Swarm Optimization (PSO) [8], PSO with load balancing mutation [9], Pareto-based Grey Wolf Optimizer (PGWO) [10], Multi-objective ant colony system (MOACS) [11], hybrid Gravitational Search Algorithm (GSA) [12], Genetic algorithm (GA) using an adaptive penalty function [13] and hybrid Bat and Binary Bat algorithm (BBA) [14]). The optimization algorithms that underlie these scheduling algorithms may easily get trapped in local optima earlier than expected because of limitations in their exploration methods [15], [16], [17], [18], [19]. Besides, the performance of the optimization algorithms degrades when dealing with medium- and high-dimensional optimization problems. It is also important to note that hybrid scheduling algorithms (e.g., hybrid GSA, hybrid BBA) normally require more execution time than the traditional scheduling optimization algorithms (e.g., GSA, BA).
This is because hybrid optimization algorithms integrate one or more local or global search methods into their optimization loops. Therefore, the duration of an iteration of a hybrid algorithm is the sum of the time required for the optimization operators to complete and the time required for the integrated search method to finish execution. Besides, the execution time varies from one iteration to another depending on the complexity of the integrated search method. This means that the average duration of an iteration in hybrid optimization algorithms is longer than in traditional optimization algorithms [16], [17], [18].
The Distributed Grey Wolf Optimizer (DGWO) is a parallel version of the Grey Wolf Optimizer (GWO) algorithm [18]. This means that the optimization process of DGWO can be performed on parallel machines. There are two main reasons that make DGWO an interesting choice for solving optimization problems. First, DGWO has a faster convergence rate than popular optimization algorithms such as the Grey Wolf Optimizer, Cuckoo search [20], [21], [22], the memory-based hybrid Dragonfly algorithm [23] and the Fireworks algorithm with differential mutation [24]. Second, DGWO has only one parameter (a vector described in Section 3.1), which does not require fine tuning.
In this paper, we propose an optimization-based scheduling algorithm for workflow applications based on the DGWO algorithm. We model the task scheduling problem of workflow applications as a minimization problem (i.e., provide a mapping of the dependent tasks of a workflow application to compute resources in a cloud computing environment such that the total execution cost of the application, comprising computation and data transmission costs, is minimized). Candidate solutions to the scheduling problem are generated in DGWO using the largest order value (LOV) method [25], as described in Section 4.1. We experimentally evaluated DGWO against well-known optimization-based scheduling algorithms such as Particle Swarm Optimization (PSO) [8] and GWO [26] using two types of workflows: balanced and imbalanced workflows. Overall, the experimental results on balanced workflows indicate that DGWO outperforms the other algorithms when applied to scheduling problems with various data sizes. The experimental results on imbalanced workflows indicate more clearly that DGWO performs better than the other algorithms.
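The decoding and cost evaluation described above can be illustrated with a minimal Python sketch. It is not the paper's Algorithm 3 or 4. The LOV step ranks the components of a continuous position vector in descending order, which is how the method is commonly described. Folding each rank onto a VM with a modulo operation, and the names `lov_ranks`, `ranks_to_vms`, `total_cost`, `compute_cost`, `data_size` and `link_cost`, are assumptions made only for illustration.

```python
from typing import Dict, List, Tuple


def lov_ranks(position: List[float]) -> List[int]:
    """Largest-order-value rule: the largest component gets rank 0,
    the next largest gets rank 1, and so on (ties broken by index)."""
    order = sorted(range(len(position)), key=lambda i: (-position[i], i))
    ranks = [0] * len(position)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return ranks


def ranks_to_vms(ranks: List[int], num_vms: int) -> List[int]:
    """ASSUMED decoding step: fold each task's rank onto a VM index."""
    return [r % num_vms for r in ranks]


def total_cost(
    assignment: List[int],
    compute_cost: List[List[float]],           # compute_cost[task][vm]
    data_size: Dict[Tuple[int, int], float],   # (parent, child) -> data volume
    link_cost: List[List[float]],              # link_cost[vm_a][vm_b] per data unit
) -> float:
    """Computation cost of every task on its VM, plus transmission cost
    for every dependency whose two tasks run on different VMs."""
    cost = sum(compute_cost[t][vm] for t, vm in enumerate(assignment))
    for (src, dst), size in data_size.items():
        a, b = assignment[src], assignment[dst]
        if a != b:
            cost += size * link_cost[a][b]
    return cost


# Hypothetical 4-task, 2-VM instance.
position = [0.3, 1.7, -0.4, 0.9]
assignment = ranks_to_vms(lov_ranks(position), num_vms=2)
cost = total_cost(
    assignment,
    compute_cost=[[1.0, 2.0], [3.0, 1.5], [2.0, 2.5], [4.0, 3.0]],
    data_size={(0, 1): 10.0, (0, 2): 5.0, (1, 3): 4.0, (2, 3): 2.0},
    link_cost=[[0.0, 0.1], [0.1, 0.0]],
)
```

An optimizer such as DGWO would then search over `position` vectors and use `total_cost` of the decoded assignment as the fitness to minimize.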
The main contributions of this paper are as follows:
1. We introduce a discrete variation of DGWO (Algorithm 4) that can be used to solve various types of scheduling problems.
2. We modify the dynamic scheduling algorithm proposed in [8], [10] to make it suitable for DGWO. Like the original dynamic scheduling algorithm, the modified algorithm can deal with both balanced and imbalanced workflow applications. Algorithm 3 aims to minimize the computation and data transmission costs.
3. We use WorkflowSim and real scientific workflows to evaluate the performance of DGWO against the simulation results reported in [27] for two algorithms: PSO and Binary PSO (BPSO). The experimental results suggest that DGWO provides the lowest makespans for real scientific workflows of different sizes. They also indicate that the performance of DGWO relative to BPSO and PSO improves as the size of the workflow increases.
4. We also conduct experiments to evaluate and compare DGWO to two popular scheduling algorithms, PSO and GWO, using balanced and imbalanced workflows. The experimental results show that DGWO is the fastest converging algorithm.
The rest of the paper is organized as follows: Section 2 provides a review of recent methods that attempt to solve the task scheduling problem. Section 3 provides background on the Grey Wolf Optimizer algorithm, the Distributed Grey Wolf Optimizer and the task scheduling problem in workflow applications. Section 4 discusses the proposed scheduling algorithms in detail. Section 5 presents the experimental results. Finally, Section 6 presents the conclusions of this paper and discusses some future work directions.
Section snippets
Recent work
The task scheduling problem in cloud computing environments is an NP-hard problem [7]. In recent years, several researchers have attempted to solve the scheduling problem using optimization-based scheduling algorithms [8], [9], [10], [11], [12], [13], [14], [28], [29], [30], [31], [32], [33], [34]. This section provides an overview of recently proposed optimization-based scheduling algorithms.
Pandey et al. [8] proposed Particle Swarm Optimization-based Heuristic, one of the
Preliminaries
This section summarizes the underlying concepts of the Grey Wolf Optimizer algorithm (Section 3.1) and the Distributed Grey Wolf Optimizer algorithm (Section 3.2). The final part of the section (Section 3.3) describes the mathematical formulation of the task scheduling problem in cloud computing environments.
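As a point of reference for the GWO concepts mentioned above, the following Python sketch shows the position update as it is commonly described in the GWO literature. Section 3.1 is not reproduced here, so the details are an assumption and not the paper's own listing: the three best wolves (alpha, beta, delta) guide the pack, the control value `a` decreases linearly from 2 toward 0, and each wolf moves to the average of three leader-guided estimates. The function names and default settings are hypothetical.

```python
import random
from typing import Callable, List

Vector = List[float]


def gwo_step(wolves: List[Vector], fitness: Callable[[Vector], float],
             a: float, rng: random.Random) -> List[Vector]:
    """One GWO iteration (minimization): move every wolf toward the
    three best wolves of the current pack. Needs at least 3 wolves."""
    leaders = sorted(wolves, key=fitness)[:3]  # alpha, beta, delta
    new_pack = []
    for x in wolves:
        new_x = []
        for d in range(len(x)):
            estimate = 0.0
            for leader in leaders:
                A = 2.0 * a * rng.random() - a   # A in [-a, a]
                C = 2.0 * rng.random()           # C in [0, 2]
                D = abs(C * leader[d] - x[d])    # distance to the leader
                estimate += leader[d] - A * D
            new_x.append(estimate / 3.0)         # average of the 3 estimates
        new_pack.append(new_x)
    return new_pack


def gwo(fitness: Callable[[Vector], float], dim: int, n_wolves: int = 10,
        iters: int = 50, lo: float = -5.0, hi: float = 5.0,
        seed: int = 0) -> Vector:
    """Run the basic loop and return the best wolf of the final pack."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)]
              for _ in range(n_wolves)]
    for t in range(iters):
        a = 2.0 * (1.0 - t / iters)  # decreases linearly from 2 toward 0
        wolves = gwo_step(wolves, fitness, a, rng)
    return min(wolves, key=fitness)


def sphere(x: Vector) -> float:
    """Simple test objective with its minimum at the origin."""
    return sum(v * v for v in x)


best = gwo(sphere, dim=3)
```

A distributed variant would run several such packs in parallel, but how DGWO organizes and synchronizes them is specified in Section 3.2 and is not sketched here.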
Distributed Grey Wolf Optimizer for cloud workflow scheduling
One of the most important services of cloud environments is the distributed computing service that provides various distributed compute resources to execute applications such as workflow applications. Scheduling the tasks of a workflow application requires assigning the tasks to compute resources in the cloud. This process mainly depends on the availability of compute resources and the network load. Unfortunately, the scheduling process is classified as an NP-hard problem [47]. This
Experiments
This section is divided into seven subsections as follows: Section 5.1 provides the measurement of comparison, Section 5.2 describes the experimental setup for the algorithms, including data and implementation, Section 5.3 provides a comparison between the algorithms when applied to balanced workflows, Section 5.4 provides a comparison between the algorithms when applied to imbalanced workflows, Section 5.5 discusses the overall experimental results and the limitations of DGWO, Section 5.6 provides
Conclusions and future work
The current paper presented an optimization-based scheduling algorithm for workflow applications based on the Distributed Grey Wolf Optimizer (DGWO) algorithm. The task scheduling problem of workflow applications was modeled as an optimization problem as follows: Provide a mapping of dependent tasks of a workflow application to compute resources on a cloud computing environment such that the total execution cost (computation and data transmission costs) of the application is minimized.
The
CRediT authorship contribution statement
Bilal H. Abed-alguni: Conceptualization, Methodology, Investigation, Validation, Writing - original draft, Supervision. Noor Aldeen Alawad: Experimentation, Visualization, Reviewing and editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (63)
- et al., Game-score: Game-based energy-aware cloud scheduler and simulator for computational clouds, Simul. Model. Pract. Theory (2019)
- et al., Enhanced particle swarm optimization for task scheduling in cloud computing environments, Procedia Comput. Sci. (2015)
- et al., A GSA based hybrid algorithm for bi-objective workflow scheduling in cloud computing, Future Gener. Comput. Syst. (2018)
- et al., Cuckoo search via Lévy flights
- et al., Memory based hybrid dragonfly algorithm for numerical optimization problems, Expert Syst. Appl. (2017)
- et al., Grey wolf optimizer, Adv. Eng. Softw. (2014)
- et al., Task scheduling in a cloud computing environment using hgpso algorithm, Cluster Comput. (2018)
- et al., Review of optimization techniques applied for the integration of distributed generation from renewable energy sources, Renew. Energy (2017)
- et al., An efficient symbiotic organisms search algorithm with chaotic optimization strategy for multi-objective task scheduling problems in cloud computing environment, J. Netw. Comput. Appl. (2019)
- et al., Exchange strategies for multiple ant colony system, Inform. Sci. (2007)