Elsevier

Applied Soft Computing

Volume 77, April 2019, Pages 547-566
An intelligent water drops-based workflow scheduling for IaaS cloud

https://doi.org/10.1016/j.asoc.2019.02.004

Highlights

  • A comprehensive review of the existing strategies along with their merits and demerits.

  • We develop an efficient mechanism to find the multiple PCPs of the workflow dynamically.

  • We devise an efficient PCP-VM assignment strategy to improve resource utilization.

  • Finally, we evaluate the performance of the proposed algorithm using popular scientific workflows.

Abstract

Cloud computing is an emerging technology in a distributed environment with a collection of large-scale heterogeneous systems. One of the challenging issues in a cloud data center is to select the minimum number of virtual machine (VM) instances needed to execute the tasks of a workflow within a time limit. The objectives of such a strategy are to minimize the total execution time of the workflow and to improve resource utilization. However, the existing algorithms do not guarantee high resource utilization, although they can achieve high execution efficiency. Higher resource utilization depends on the reusability of VM instances. In this work, we propose a new intelligent water drops-based workflow scheduling algorithm for Infrastructure-as-a-Service (IaaS) cloud. The objectives of the proposed algorithm are to achieve higher resource utilization and to minimize the makespan within the given deadline and budget constraints. The first contribution of the algorithm is to find multiple partial critical paths (PCPs) of a workflow, which helps in finding suitable VM instances. The second contribution is a PCP-VM assignment strategy that assigns VM instances to the PCPs. The proposed algorithm is evaluated through various simulation runs using synthetic datasets and various performance metrics. Through comparison, we show the superior performance of the proposed algorithm over the existing ones.

Introduction

Cloud computing is an emerging technology in a distributed environment with a collection of large-scale heterogeneous systems [1]. This technology enables the delivery of various computing resources over the Internet and follows an on-demand service model in which users are charged as per their requirements [2]. The cloud environment provides three types of services: Infrastructure as a Service (IaaS), Platform as a Service, and Software as a Service [3]. The IaaS cloud offers various computing resources to users on an on-demand basis [4]. To maximize revenue, an IaaS service provider aims to provide better quality of service (QoS) to users as per the service-level agreement [5], [6] while maximizing resource utilization. On the other hand, workflow applications require suitable VM instances in order to complete execution within a given deadline. A workflow is represented by a directed acyclic graph (DAG) in which the nodes represent the tasks and the edges represent the inter-dependencies among the tasks. The workflow scheduling problem is NP-hard [7], so dynamic and optimization strategies are used to solve it efficiently. The abbreviations used, with their full forms and descriptions, are listed in Table 1.
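The DAG view of a workflow described above can be illustrated with a minimal Python sketch. The five-task workflow, the task names, and the helper functions `execution_order` and `respects_dependencies` are hypothetical and are not taken from the paper; the sketch only shows that any valid schedule must run each task after all of its predecessors.

```python
from graphlib import TopologicalSorter

# Hypothetical five-task workflow: each key maps a task to the set of
# tasks it depends on (its predecessors in the DAG).
workflow = {
    "t1": set(),
    "t2": {"t1"},
    "t3": {"t1"},
    "t4": {"t2", "t3"},
    "t5": {"t4"},
}

def execution_order(dag):
    """Return one valid task order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())

def respects_dependencies(order, dag):
    """Check that every task appears after all of its predecessors."""
    position = {task: i for i, task in enumerate(order)}
    return all(position[p] < position[t]
               for t, preds in dag.items() for p in preds)

order = execution_order(workflow)
print(order, respects_dependencies(order, workflow))
```

Tasks t2 and t3 have no mutual dependency, so their relative order is free; this freedom is what a scheduler exploits when it places independent tasks on different VM instances.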

Several workflow scheduling strategies were studied in the recent past, and most of them focused on minimizing the makespan or meeting the deadline [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23]. However, they ignore the resource and budget constraints. The execution cost of a workflow needs to be minimized because cloud providers offer their services on a pay-per-use model. Moreover, in a dynamic environment, higher resource utilization depends on the reusability of VM instances. To achieve the desired objectives, an intelligent strategy is required to assign and execute the tasks of a workflow on a minimum number of VM instances while meeting the deadline. From the literature [11], [12], [13], [22], we observed that finding a PCP of a workflow and assigning the tasks of the path to suitable VM instances may achieve the target to some extent. A PCP of a workflow consists of a set of interrelated tasks that need to be executed sequentially. Furthermore, we also noticed that finding multiple PCPs dynamically and assigning the tasks belonging to each path to suitable VM instances may achieve the desired objectives. An illustration is shown in Fig. 1. The tasks belonging to PCPs P1, P2, and P3 are assigned to the VM instances I1, I2, and I3, respectively.
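The idea of splitting a workflow into several sequential paths and giving each path its own VM instance can be sketched as follows. This is not the IWD-based procedure of the paper: it is a plain greedy decomposition that repeatedly extracts the longest-runtime chain among the tasks not yet assigned. The workflow, the runtimes, and the function names are assumptions made for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: predecessor sets and runtimes (seconds) are made up.
preds = {
    "t1": set(), "t2": {"t1"}, "t3": {"t1"},
    "t4": {"t2"}, "t5": {"t3"}, "t6": {"t4", "t5"},
}
runtime = {"t1": 4, "t2": 6, "t3": 3, "t4": 5, "t5": 2, "t6": 4}

def longest_path(preds, runtime, remaining):
    """Longest-runtime chain that uses only tasks still in `remaining`."""
    best_len, best_prev = {}, {}
    for task in TopologicalSorter(preds).static_order():
        if task not in remaining:
            continue
        # Extend the heaviest chain ending at an unassigned predecessor.
        candidates = [p for p in preds[task] if p in remaining]
        prev = max(candidates, key=lambda p: best_len[p], default=None)
        best_len[task] = runtime[task] + (best_len[prev] if prev else 0)
        best_prev[task] = prev
    end = max(best_len, key=best_len.get)
    path = []
    while end is not None:          # walk back from the chain's last task
        path.append(end)
        end = best_prev[end]
    return path[::-1]

def partition_into_paths(preds, runtime):
    """Greedily peel off sequential paths until every task is covered."""
    remaining, paths = set(preds), []
    while remaining:
        path = longest_path(preds, runtime, remaining)
        paths.append(path)
        remaining -= set(path)
    return paths

paths = partition_into_paths(preds, runtime)
# One VM instance per path, mirroring the P1 -> I1, P2 -> I2 pairing.
assignment = {f"I{i}": path for i, path in enumerate(paths, start=1)}
print(assignment)
```

Because the tasks of one path run sequentially on the same instance, no data has to move between instances along that path, which is the usual motivation for path-based assignment.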

Meta-heuristic algorithms are high-level, problem-independent algorithms that provide a set of rules or strategies for finding a better solution to a given problem [24]. They converge to a global (or near-global) optimum, and they do so faster than traditional techniques. Therefore, meta-heuristic algorithms have been increasingly used to solve workflow scheduling problems [25], [26]. In this work, we adopt a meta-heuristic technique called the intelligent water drops (IWD) technique [27], [28] to find the multiple PCPs of a workflow dynamically. The IWD algorithm has been used to solve different scientific problems such as the traveling salesman problem, the multidimensional knapsack problem, the N-queens problem, etc. [28]. From [27], [28], we also observe that IWD has a faster convergence rate than the existing meta-heuristic techniques.
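For readers unfamiliar with IWD, the sketch below shows a single step of a water drop as the basic algorithm is commonly presented in the IWD literature: edges carrying less soil are more likely to be chosen, and traversing an edge speeds the drop up and removes soil from that edge. The constants, the two-edge graph, and the function names are illustrative assumptions; how the paper adapts these rules to PCP discovery is not shown in this excerpt.

```python
import random

# Illustrative parameter values; the paper's own settings are not given here.
A_V, B_V, C_V = 1.0, 0.01, 1.0   # velocity-update constants
A_S, B_S, C_S = 1.0, 0.01, 1.0   # soil-update constants
RHO_N = 0.9                      # local soil-update rate
EPS = 0.01                       # avoids division by zero

def choose_next(soil, current, candidates, rng=random):
    """Pick the next node; edges with less soil get a higher probability."""
    soils = [soil[(current, j)] for j in candidates]
    shift = min(soils) if min(soils) < 0 else 0.0  # keep shifted soil >= 0
    weights = [1.0 / (EPS + s - shift) for s in soils]
    return rng.choices(candidates, weights=weights, k=1)[0]

def move(drop, soil, current, nxt, hud):
    """Apply the velocity, travel-time, and soil updates for one edge."""
    edge = (current, nxt)
    drop["velocity"] += A_V / (B_V + C_V * soil[edge] ** 2)
    travel_time = hud / drop["velocity"]       # hud: heuristic undesirability
    delta = A_S / (B_S + C_S * travel_time ** 2)
    soil[edge] = (1 - RHO_N) * soil[edge] - RHO_N * delta
    drop["soil"] += delta                      # the drop carries the soil away
    return delta

soil = {("a", "b"): 1000.0, ("a", "c"): 10.0}
drop = {"velocity": 200.0, "soil": 0.0}
nxt = choose_next(soil, "a", ["b", "c"], random.Random(0))
delta = move(drop, soil, "a", nxt, hud=1.0)
print(nxt, round(delta, 2), round(soil[("a", nxt)], 2))
```

Over many iterations the soil removed from frequently used edges makes them even more attractive, which is the positive-feedback mechanism that drives the search toward good paths.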

In this paper, we use the IWD technique to find multiple PCPs of a workflow application. The IWD algorithm is an iterative generation process that is able to find the multiple PCPs of a workflow dynamically. In contrast, the existing algorithms find only a single PCP in each cycle and fail to produce multiple PCPs. The existing algorithms [11], [12], [13], [22] require different types of information about the graph while finding a PCP, such as the weights of the edges and the earliest start time and latest finish time of the nodes. The IWD algorithm, however, finds the multiple PCPs based on knowledge of the structure and the dependencies of the nodes. This indicates that the IWD algorithm is better than the existing techniques at finding multiple PCPs. The authors in [29] have proposed a workflow scheduling algorithm using an iterative heuristic. However, the time complexity of that algorithm is very high, i.e., O(n^3 d^2 m), where n, d, and m are the number of tasks, the number of Pareto non-dominated solutions, and the number of resources, respectively, and resource utilization is not considered.

In this work, we propose a new IWD-based dynamic workflow scheduling algorithm for IaaS cloud, referred to as IWD-DWS. The contribution of the proposed algorithm is twofold. The first contribution is to find multiple PCPs of the workflow, which helps in finding the best-fit VM instances. The objective of the first contribution is to minimize the makespan and the total execution cost while meeting the deadline. The second contribution is to schedule the tasks of the PCPs on the best-fit VM instances. The objective of the second contribution is to utilize the computing resources efficiently through the reusability of the VM instances. Finally, we evaluate the proposed algorithm over synthetic datasets using various performance metrics. The major contributions of this work are summarized as follows:

  • (a) A comprehensive review of the existing strategies along with their merits and demerits.

  • (b) We develop an efficient mechanism to find the multiple PCPs of the workflow dynamically.

  • (c) We devise an effective PCP-VM assignment strategy to improve resource utilization.

  • (d) Finally, we evaluate the performance of the proposed algorithm using popular scientific workflows of different sizes.

The rest of the paper is organized as follows. The related work on existing workflow scheduling strategies is discussed in Section 2. The overview of the IWD algorithm, the system model, and the problem formulation are presented in Section 3. The proposed IWD-DWS algorithm is discussed in Section 4. The performance evaluation of the proposed algorithm is discussed in Section 5. Finally, the conclusion and future scope are given in Section 6.

Related work

Extensive research has been carried out on workflow scheduling in cloud environments [8], [9], [10], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23]. Here, we review some of the workflow scheduling algorithms.

Mao and Humphrey have developed a workflow scheduling strategy in a dynamic cloud environment [8]. Each data center contained various types of VM instances at different prices. The algorithm finds the most effective VM instance for each task based on the

Preliminaries

In this section, we first present the overview of the IWD algorithm followed by the system and workflow model. Next, we elaborate on the problem formulation to be addressed by the proposed algorithm.

Proposed IWD-DWS algorithm

The proposed IWD-DWS algorithm is divided into two stages: selection of PCPs and PCP-VM assignment.
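The second stage can be pictured with a simplified best-fit sketch in Python. It is an assumption-laden illustration, not the paper's PCP-VM assignment procedure: each PCP is reduced to a total length in million instructions (MI), every path shares one deadline, dependencies between paths are ignored, and the VM types, prices, and names (`VmType`, `VmInstance`, `assign_path`) are hypothetical. The sketch only shows how reusing an already leased instance keeps the instance count low.

```python
from dataclasses import dataclass, field

@dataclass
class VmType:
    name: str
    mips: float    # processing speed, million instructions per second
    price: float   # cost per second of lease time (hypothetical unit)

@dataclass
class VmInstance:
    vm_type: VmType
    free_at: float = 0.0                       # time the instance becomes idle
    paths: list = field(default_factory=list)  # PCPs placed on this instance

def assign_path(path_id, length_mi, deadline, instances, vm_types):
    """Reuse a leased instance if it meets the deadline; otherwise lease
    the VM type with the lowest execution cost that does."""
    reusable = [vm for vm in instances
                if vm.free_at + length_mi / vm.vm_type.mips <= deadline]
    if reusable:
        vm = min(reusable, key=lambda v: v.vm_type.price)
    else:
        feasible = [t for t in vm_types if length_mi / t.mips <= deadline]
        if not feasible:
            raise ValueError(f"no VM type can finish {path_id} by {deadline}")
        cheapest = min(feasible, key=lambda t: t.price * length_mi / t.mips)
        vm = VmInstance(cheapest)
        instances.append(vm)
    vm.free_at += length_mi / vm.vm_type.mips
    vm.paths.append(path_id)
    return vm

vm_types = [VmType("small", 1000, 0.01), VmType("large", 4000, 0.05)]
instances = []
for path_id, length_mi in [("P1", 20000), ("P2", 8000), ("P3", 16000)]:
    assign_path(path_id, length_mi, 10.0, instances, vm_types)
print([(vm.vm_type.name, vm.paths) for vm in instances])
```

In this run P2 fits into the idle time left on the instance leased for P1, so only P3 triggers a second lease; a full scheduler would also account for data transfer times and per-path sub-deadlines.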

Performance evaluation

The performance of the proposed IWD-DWS algorithm is evaluated over four well-known scientific workflows of different sizes, whose detailed descriptions are presented by Juve et al. [31]. For the sake of completeness, brief discussions of the scientific workflows are given. Each of the workflows has a different structure in terms of the number of tasks and computational characteristics. Montage is an astronomy application which is used to create custom mosaics of the sky based on the

Conclusion and future work

In this paper, we have proposed a new intelligent water drops-based workflow scheduling algorithm for IaaS cloud, referred to as IWD-DWS. The proposed algorithm is divided into two phases, namely selection of PCPs and PCP-VM assignment. We have applied the intelligent water drops metaheuristic technique to find the multiple critical paths, which helps in finding suitable VM instances. We have devised a scheduling strategy to assign the tasks of a PCP to a suitable VM instance to minimize makespan and meeting the

References (33)

  • Casas, Israel, et al. A balanced scheduler with data reuse and replication for scientific workflows in cloud computing systems. Future Gener. Comput. Syst. (2017)
  • Verma, Amandeep, et al. A hybrid multi-objective particle swarm optimization for scientific workflow scheduling. J. Parallel Comput. (2017)
  • T. Chatterjee, V.K. Ojha, M. Adhikari, S. Banerjee, U. Biswas, V. Snasel (2014), Design and implementation of a new...
  • Banerjee, S., et al. Development and analysis of a new cloudlet allocation strategy for QoS improvement in cloud. Arab. J. Sci. Eng. (2014)
  • M. Mao, M. Humphrey, Auto-scaling to minimize cost and meet application deadlines in cloud workflows, in: Proceeding of...
  • M. Malawski, G. Juve, E. Deelman, J. Nabrzyski, Cost-and deadline-constrained provisioning for scientific workflow...