Genetic-based algorithms applied to a workflow scheduling algorithm with security and deadline constraints in clouds

https://doi.org/10.1016/j.compeleceng.2017.12.004

Abstract

A number of metaheuristic scheduling techniques for clouds, together with their applications, have been described in the literature. The efficiency of metaheuristic techniques has been established in a wide range of workflow scheduling algorithms for cloud environments. However, it is still unknown whether the metaheuristic chosen is suitable for solving the optimization problem. This paper examines the effect of both Particle Swarm Optimization (PSO) and Genetic-based algorithms (GA) on attempts to optimize workflow scheduling. A security and cost-aware workflow scheduling algorithm was selected to evaluate the performance of the metaheuristics. Three algorithms were evaluated on three real-world workflows with a risk rate constraint that ranged from 0 to 1 in steps of 0.1. The findings indicate that GA-based algorithms significantly outperformed PSO in terms of both cost-effectiveness and response time.

Introduction

A scientific workflow is a recent paradigm for distributed programming that is deployed in computational experiments in scientific areas such as physics, astronomy, and biology. A workflow can be defined as a procedure involving a series of steps designed to simplify the complexity of executing and managing applications [1]. It is commonly represented as a Directed Acyclic Graph (DAG), where each node carries out a task, and each edge denotes a precedence or flow constraint between the tasks. Previously, workflows were deployed in computational grids. However, owing to the increase in the complexity of scientific applications for handling big data, more powerful and scalable infrastructures are needed to run complex workflows within a reasonable amount of time [2].
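The DAG representation described above can be sketched as follows. This is a minimal illustration, not code from the paper: the task names, the edge list, and the function name `topological_order` are hypothetical. It uses Kahn's algorithm to produce one execution order that respects every precedence constraint.

```python
from collections import deque

def topological_order(tasks, edges):
    """Return one valid execution order for a workflow DAG (Kahn's algorithm).

    tasks: list of task names (hypothetical labels).
    edges: list of (parent, child) pairs; the parent must finish first.
    """
    children = {t: [] for t in tasks}
    indegree = {t: 0 for t in tasks}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1

    # Tasks without unfinished parents are ready to run.
    ready = deque(t for t in tasks if indegree[t] == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for child in children[task]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    if len(order) != len(indegree):
        raise ValueError("graph has a cycle, so it is not a valid workflow DAG")
    return order

# Toy four-task workflow: t1 fans out to t2 and t3, which both feed t4.
TASKS = ["t1", "t2", "t3", "t4"]
EDGES = [("t1", "t2"), ("t1", "t3"), ("t2", "t4"), ("t3", "t4")]

print(topological_order(TASKS, EDGES))  # ['t1', 't2', 't3', 't4']
```

A scheduler can walk the tasks in this order and be sure that every parent has been placed before its children.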

Cloud computing is an infrastructure that can be rapidly and elastically provisioned on demand [3]. It can offer different virtual machine configurations that are capable of executing workflows. Clouds can be classified as private, community-based, public, or hybrid. A private cloud is owned and used by a single organization, and does not charge for the services it offers. Community-based clouds are designed for a specific community that has shared concerns. A public cloud shares services among multiple tenants, who are charged on the basis of their use. Finally, a hybrid cloud consists of a mix of public, community, and private clouds, where public or community cloud resources are integrated with a private cloud to create a single environment. There is a wide range of cloud services (e.g., virtual machine (VM) types, storage policies, and cloud data center locations) [1], [4]. For this reason, scheduling is a key factor when a workflow is executed in a cloud environment and has been the subject of investigation in recent years [5]. Scheduling algorithms are designed to map the tasks of the workflow onto VMs based on scheduling criteria and subject to the constraints of the users. However, the heterogeneity of these tasks and the wide range of cloud services lead to an NP-hard optimization problem. Carrying out optimal scheduling within a reasonable time is a challenge because several variables have to be taken into account, such as the fact that there are (a) several types of VMs with different capacities and prices, (b) tasks with heterogeneous loads, and (c) other attributes that impede the process of optimization.
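To make the trade-off between VM capacity and price concrete, the sketch below evaluates the makespan and monetary cost of one task-to-VM mapping. It is a deliberately simplified model and not the one formulated in [7]: the workloads, VM speeds, and prices are hypothetical, every task is assumed to run on its own VM instance, and data transfer times are ignored.

```python
def evaluate_schedule(order, edges, workload, vm_types, assignment):
    """Return (makespan, cost) of a task-to-VM-type assignment.

    order: tasks in a valid topological order.
    edges: (parent, child) precedence pairs.
    workload: task -> work units (hypothetical numbers).
    vm_types: name -> (speed in work units per hour, price per hour).
    assignment: task -> VM type name.
    Assumption: each task runs on its own VM instance, so tasks never
    queue behind each other, and data transfer time is ignored.
    """
    parents = {t: [] for t in order}
    for parent, child in edges:
        parents[child].append(parent)

    finish = {}
    cost = 0.0
    for task in order:
        speed, price = vm_types[assignment[task]]
        runtime = workload[task] / speed
        # A task starts once its slowest parent has finished.
        start = max((finish[p] for p in parents[task]), default=0.0)
        finish[task] = start + runtime
        cost += runtime * price
    return max(finish.values()), cost

ORDER = ["t1", "t2", "t3", "t4"]
EDGES = [("t1", "t2"), ("t1", "t3"), ("t2", "t4"), ("t3", "t4")]
WORKLOAD = {"t1": 2.0, "t2": 4.0, "t3": 8.0, "t4": 2.0}
VM_TYPES = {"small": (1.0, 1.0), "large": (2.0, 3.0)}

ALL_SMALL = {t: "small" for t in ORDER}
MIXED = dict(ALL_SMALL, t3="large")  # move only the heaviest task

print(evaluate_schedule(ORDER, EDGES, WORKLOAD, VM_TYPES, ALL_SMALL))  # (12.0, 16.0)
print(evaluate_schedule(ORDER, EDGES, WORKLOAD, VM_TYPES, MIXED))      # (8.0, 20.0)
```

In this toy case, moving the heaviest task to the faster VM shortens the makespan from 12 to 8 hours but raises the cost from 16 to 20. With more tasks and VM types, the number of such mappings grows exponentially, which is why exhaustive search quickly becomes impractical.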

In view of this, metaheuristic techniques have been employed in workflow scheduling in clouds to find a near-optimal scheduling scheme. However, research on the use of metaheuristics for workflow scheduling in clouds has been largely restricted to a single technique. In a recent study, it was found that Particle Swarm Optimization (PSO) is the most widely employed technique, and that more than 50% of the scheduling algorithms were based on metaheuristics [6]. These findings suggest that there is a need to understand whether scheduling algorithms that are optimized by metaheuristics can achieve a better performance than any alternative and whether PSO is, in fact, a good choice.

This paper examines the effects of different metaheuristics on a workflow scheduling algorithm for clouds. Our major objective was to investigate the differences between the optimization obtained by Particle Swarm Optimization, the Genetic Algorithm (GA), and the Multi-Population Genetic Algorithm (MPGA). The approach adopted in this study involves a mixed methodology based on a security-aware and cost-aware workflow scheduling algorithm [7], which applied PSO to optimize the combinatorial scheduling scheme, and on Genetic-based algorithms. Real-world workflows were deployed to evaluate the suitability of each metaheuristic. In addition, an analysis was conducted of the time spent until the stagnation of each optimization technique. Statistical tests with a significance level of p ≤ 0.05 were conducted to check whether there is a significant difference between PSO and GA-based algorithms.

The findings should be a useful addition to solutions provided by workflow scheduling algorithms based on multidimensional optimization, and show an efficient method for achieving an optimization beyond that of PSO. However, the reader should bear in mind that this study does not claim that workflow scheduling problems optimized by GA-based algorithms always outperform PSO. As mentioned earlier, the optimization efficiency of GA-based and PSO algorithms has only been examined in a single workflow scheduling algorithm.

The remainder of the paper is structured as follows: Section 2 summarizes the related work. Section 3 formulates the problem and system models in detail. The methodology employed for the experiments is outlined in Section 4. Section 5 explains the experimental planning and examines the results of the evaluation. Finally, in Section 6 there are some concluding remarks and suggestions for future research in the area.

Section snippets

Related work

Scheduling scientific workflows is an important and challenging area of research in cloud computing. There have been numerous studies on the algorithms needed to find the optimal workflow scheduling in different scenarios and comply with the constraints of the users. For instance, workflow scheduling algorithms have been created to enable users to meet their deadline [4], [5] although these incur high execution costs. Thus, other studies have addressed the issue of budgetary constraints in

System models and problem formulation

Our analysis is based on the scheme proposed by Li et al. called “A security-aware and cost-aware scheduling algorithm for heterogeneous tasks of scientific workflow in clouds” [7]. The problem consists of scheduling a scientific workflow so as to reduce execution costs under deadline constraints. This section describes the workflow, cloud data center, and security models. Account is also taken of the process of task analysis, the risk analysis of the workflow and the

Methods

The problem defined in Section 3 is NP-hard, so a metaheuristic approach is an alternative to deterministic methods for solving it. Two common metaheuristic approaches that are adopted to solve complex problems are (a) cooperation and competition between individuals in a given population (genetic algorithms) and (b) particle swarm algorithms for simulating social behavior. In this paper, GA and MPGA algorithms are implemented and compared with the PSO algorithm proposed in
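As an illustration of the population-based idea in (a), the following is a minimal textbook genetic algorithm for a toy task-to-VM assignment. It is not the GA or MPGA implemented in this paper: the workloads, VM types, deadline, penalty weight, and all function names are assumptions made for the example, and the security model is left out entirely. Each chromosome holds one VM-type index per task, and a schedule that misses the deadline is penalized instead of being discarded.

```python
import random

WORKLOAD = [2.0, 4.0, 8.0, 2.0]                  # hypothetical work units per task
PARENTS = [[], [0], [0], [1, 2]]                 # parent indices, tasks in topological order
VM_TYPES = [(1.0, 1.0), (2.0, 3.0), (4.0, 8.0)]  # hypothetical (speed, price per hour)
DEADLINE = 8.0                                   # hypothetical deadline in hours

def fitness(chromosome):
    """Execution cost plus a large penalty when the deadline is missed."""
    finish, cost = [], 0.0
    for task, vm in enumerate(chromosome):
        speed, price = VM_TYPES[vm]
        runtime = WORKLOAD[task] / speed
        start = max((finish[p] for p in PARENTS[task]), default=0.0)
        finish.append(start + runtime)
        cost += runtime * price
    return cost + 1000.0 * max(0.0, max(finish) - DEADLINE)

def run_ga(pop_size=20, generations=40, mutation_rate=0.1, seed=0):
    """Return (best chromosome, its fitness) after a fixed number of generations."""
    rng = random.Random(seed)
    n_tasks, n_vms = len(WORKLOAD), len(VM_TYPES)
    population = [[rng.randrange(n_vms) for _ in range(n_tasks)]
                  for _ in range(pop_size)]
    best = min(population, key=fitness)
    for _ in range(generations):
        next_population = [best[:]]  # elitism: the best individual survives
        while len(next_population) < pop_size:
            # Binary tournament selection for each parent.
            p1 = min(rng.sample(population, 2), key=fitness)
            p2 = min(rng.sample(population, 2), key=fitness)
            # Uniform crossover: each gene comes from either parent.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            # Mutation: occasionally reassign a task to a random VM type.
            child = [rng.randrange(n_vms) if rng.random() < mutation_rate else g
                     for g in child]
            next_population.append(child)
        population = next_population
        best = min(population, key=fitness)
    return best, fitness(best)

best, cost = run_ga()
print(best, cost)
```

Because the best individual is always copied into the next generation, the best fitness never gets worse from one generation to the next. A multi-population variant would typically evolve several such populations side by side and exchange individuals between them, but that is not shown here.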

Experimental results

This section outlines the simulation parameters used for applying the GA, MPGA, and PSO algorithms. We have also included three synthetic workflows based on real scientific applications, as in [7]: Cybershake, SIPHT, and Epigenomics, which commonly involve big data. A detailed description of each of these, including their structure, data, and computational requirements, is provided below and in [2].

Cybershake is used to characterize earthquake hazards by means of a

Conclusions and future work

In this study, a previous PSO approach is compared with two evolutionary algorithms (GA and MPGA) for the optimization of a workflow scheduling algorithm in a cloud with security restrictions. It was found that the GA and MPGA reduce the cost of executing workflows and are able to comply with security constraints and deadlines. Moreover, one initialization (bestSec), two mutation (mutStrong, mutLimit), and three crossover (crossUniform, crossAverage, crossBLX) operators are introduced. This
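The definitions of these operators are not included in this excerpt. As a rough guide only, the sketch below shows the textbook operators that the three crossover names suggest: uniform crossover, arithmetic (average) crossover, and blend crossover (BLX-alpha). The function names, the real-valued encoding, the default alpha of 0.5, and the helper `to_vm_index` are assumptions, and the authors' operators may differ in their details.

```python
import random

def cross_uniform(p1, p2, rng):
    """Uniform crossover: take each gene from either parent with equal probability."""
    return [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]

def cross_average(p1, p2):
    """Arithmetic crossover: each child gene is the mean of the parents' genes."""
    return [(a + b) / 2.0 for a, b in zip(p1, p2)]

def cross_blx(p1, p2, rng, alpha=0.5):
    """BLX-alpha: sample each gene from the parents' interval widened by alpha."""
    child = []
    for a, b in zip(p1, p2):
        lo, hi = min(a, b), max(a, b)
        spread = alpha * (hi - lo)
        child.append(rng.uniform(lo - spread, hi + spread))
    return child

def to_vm_index(gene, n_vms):
    """Hypothetical decoder: round a real-valued gene and clamp it to a valid VM index."""
    return min(n_vms - 1, max(0, round(gene)))

rng = random.Random(42)
parent_a = [0.0, 2.0, 1.0]
parent_b = [2.0, 2.0, 0.0]

print(cross_average(parent_a, parent_b))  # [1.0, 2.0, 0.5]
child = cross_blx(parent_a, parent_b, rng)
print([to_vm_index(g, 3) for g in child])
```

Uniform crossover only recombines values that are already present in the parents, whereas the average and BLX-alpha operators can create new gene values. BLX-alpha can also sample outside the interval spanned by the parents, which is why the decoder clamps the result.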

Acknowledgments

The authors would like to express their thanks to CAPES for access to the periodicals, FAPESP and CNPq for providing the necessary resources, and USP for offering its facilities for this research. Henrique Yoshikazu Shishido is grateful to UTFPR for awarding him a scholarship.

Henrique Yoshikazu Shishido obtained his M.Sc. Degree in Computer Science from the State University of Maringá in 2010. He has been an adjunct professor at UTFPR since 2010. He is a Ph.D. candidate at the University of São Paulo where he is undertaking research in workflow scheduling. His research interests include scheduling for distributed systems, high-performance computing, and health informatics.

References (25)

  • Z. Li et al. A security and cost aware scheduling algorithm for heterogeneous tasks of scientific workflow in clouds. Future Gen Comput Syst (2016)
  • M. Masdari et al. Towards workflow scheduling in cloud computing: a comprehensive analysis. J Netw Comput Appl (2016)

    Júlio Cezar Estrella obtained his B.Sc. Degree in Computer Science at São Paulo State University in 2002 and a Ph.D. in Computer Science at the University of São Paulo in 2010. He joined the Institute of Mathematics and Computer Science in 2010, where he works as an associate professor. His research interests include Internet of Things, Service-Oriented Architecture, and Performance Evaluation.

    Claudio Fabiano Motta Toledo obtained his B.Sc. Degree in Applied and Computational Mathematics at the State University of Campinas in 1995, and M.Sc. and Ph.D. in Electrical Engineering at the State University of Campinas in 1999 and 2005, respectively. He is an associate professor at the University of São Paulo, and his research interests include optimization, metaheuristics, and evolutive systems.

    Márcio da Silva Arantes obtained his B.Sc. Degree in Computer Science from the Federal University of Lavras in 2012, and M.Sc. and a Ph.D. in Computer Science from the University of São Paulo in 2014 and 2017, respectively. He is a researcher at SENAI/SC. His research interests include mathematical modeling, evolutive algorithms and mission planning of UAVs for agriculture.

    Reviews processed and recommended for publication to the Editor-in-Chief by Guest Editor Dr. L. Bittencourt.
