Resource-efficient workflow scheduling in clouds
Introduction
As resource capacity in clusters and, more recently, clouds becomes increasingly abundant with the prevalence of large-scale multi-core systems and advances in virtualization technologies, this abundance has been relentlessly exploited, particularly to improve application performance. Scientific workflows (such as Montage [1], CyberShake [2], LIGO [3], Epigenomics [4] and SIPHT [5]) in particular can take great advantage of abundant resources as they are mostly resource-intensive with a good degree of scalability. However, the exploitation of resource abundance is a double-edged sword in that the performance improvement from such exploitation is often achieved at the expense of resource efficiency.
The inefficient use of resources when executing scientific workflows results from both the excessive amount of resources provisioned and the wastage from unused resources (Fig. 1), including idle time among task runs. The optimization of resource efficiency is of great practical importance considering its numerous benefits in the economic and environmental sustainability of large-scale computer systems like corporate data centers and clouds.
While previous work on workflow scheduling has focused on improving performance (i.e., reducing makespan) with a limited amount of resources (resource scarcity), the advent of multi-core processors and cloud computing (resource abundance) has brought much attention to resource efficiency. Existing workflow scheduling algorithms may be adapted to deal with resource abundance by limiting the number of resources to be used (a resource limit) at the time of scheduling. However, this is only a partial and ad hoc solution; for example, the schedule in Fig. 1c with a resource limit of 2 shows that makespan increases by 21% compared to the schedule in Fig. 1b in return for the use of one less resource. Moreover, the efficacy of such a solution varies across applications, and even across executions of a particular application with different inputs (e.g., data and/or parameter values). Dynamic resource provisioning with public clouds as in [7], [8] might be an alternative; however, the poor resource utilization within the billing hour, caused by the uneven widths of different levels in a workflow (see Fig. 1), still remains. Since it is very difficult, if not impossible, to find the optimal amount of resources for scheduling a given workflow application, and since current workflow scheduling algorithms perform quite well in terms of makespan, the post-processing of output workflow schedules is a practical approach to optimizing resource usage.
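The within-the-hour waste mentioned above can be made concrete with a minimal sketch. Assuming hourly billing (each VM is charged for whole hours), uneven per-VM busy times leave paid-for idle capacity; the function name and numbers below are illustrative, not from the paper:

```python
import math

def billed_vs_used_hours(busy_hours_per_vm):
    """Illustrative only: with hourly billing, each VM is charged for
    ceil(busy_time) hours, so uneven per-VM load leaves paid-for idle time."""
    used = sum(busy_hours_per_vm)
    billed = sum(math.ceil(h) for h in busy_hours_per_vm)
    return used, billed

# e.g., three VMs busy for 0.2, 1.1 and 2.5 hours
used, billed = billed_vs_used_hours([0.2, 1.1, 2.5])
print(round(used, 1), billed)  # 3.8 6
```

Here 3.8 hours of actual work incur 6 billed VM-hours; consolidating tasks onto fewer VMs shrinks this gap.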
In this paper, we consider the problem of using abundant resources efficiently when running large-scale scientific workflows. To this end, we develop Maximum Effective Reduction (MER), a workflow schedule optimization algorithm that effectively trades makespan increase for resource usage reduction, minimizing the resource usage of a workflow schedule generated by any particular scheduling algorithm. MER is a post-optimization algorithm that takes an existing workflow schedule and consolidates its tasks onto fewer resources than those used by the input schedule. The algorithm aims to find the minimal makespan increase that reduces resource usage the most; this reduction is defined as effective reduction (ER) in this paper. ER measures resource efficiency primarily as the difference between the resource usage reduction and the makespan increase of a resultant schedule relative to the input schedule.
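The ER notion above can be sketched as follows. This is a hypothetical formulation for illustration only (the paper's exact definition is given in Section 2): we take the difference between the fractional resource usage reduction and the fractional makespan increase.

```python
def effective_reduction(base_usage, new_usage, base_makespan, new_makespan):
    """Illustrative ER sketch: resource usage reduction net of makespan
    increase, both expressed as fractions of the input schedule's values.
    This is an assumed formulation, not the paper's exact definition."""
    usage_reduction = (base_usage - new_usage) / base_usage
    makespan_increase = (new_makespan - base_makespan) / base_makespan
    return usage_reduction - makespan_increase

# e.g., halving resource usage at the cost of a 10% longer makespan
er = effective_reduction(base_usage=8, new_usage=4,
                         base_makespan=100, new_makespan=110)
print(round(er, 2))  # 0.4
```

A positive value indicates that the usage saved outweighs the delay incurred, which is the trade-off MER seeks to maximize.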
In our previous work [9], we have shown that the resource usage of a workflow schedule can be reduced by consolidating tasks into idle/inefficient time slots; however, the degree of such reduction is largely limited by the preservation of makespan (Fig. 1d). MER significantly extends the primitive solution in [9] by allowing the makespan to increase in order to maximize the reduction in resource usage. In particular, MER incorporates a simple yet effective heuristic that, based on the estimated ER, determines how much delay in makespan to allow (the makespan delay limit, or simply delay limit) given the tasks being considered for consolidation. We have verified the efficacy of this heuristic by comparing it against the best empirical delay limits found in our experiments. A preliminary version of this work can also be found in [10].
Based on results obtained from our extensive simulations using scientific workflow traces, we demonstrate that MER is capable of reducing the amount of resources actually used by 54% with an average makespan increase of less than 10%. The efficacy of MER is further verified by results, from a comprehensive set of experiments with varying makespan delay limits, that show the resource usage reduction, the makespan increase and the trade-off between them for various workflow applications.
The rest of this paper is organized as follows. Section 2 describes our workflow schedule optimization problem. In Section 3, we detail our solution algorithm. Section 4 demonstrates the efficacy of MER with results obtained from our extensive simulations. Section 5 provides a survey of related work, followed by our conclusion in Section 6.
Section snippets
The problem of workflow schedule optimization
In this section, we define the workflow schedule optimization problem, describing our workflow and system models.
Maximum effective reduction algorithm
The Maximum Effective Reduction (MER) algorithm is a workflow schedule optimization technique that optimizes the trade-off between makespan increase and resource usage reduction for any workflow scheduling algorithm. MER consists of the following phases/sub-algorithms: (1) delay limit identification, (2) task consolidation and (3) resource consolidation.
In essence, MER seeks to optimize a workflow schedule by consolidating tasks in two ways: (1) filling idle time slots resulting from data
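The makespan-preserving side of this consolidation can be sketched as a greedy packing of tasks, with their (start, finish) times fixed by the input schedule, onto as few resources as possible. This is an illustrative sketch only: MER's actual consolidation also honours data dependencies and the allowed delay limit, neither of which is modelled here.

```python
def consolidate(tasks):
    """Illustrative sketch of task consolidation: pack tasks with fixed
    (start, finish) times from an input schedule onto as few resources as
    possible by filling idle time slots. Dependencies and delay limits,
    which MER also handles, are not modelled here."""
    resources = []  # each entry holds the finish time of that resource's last task
    for start, finish in sorted(tasks):
        # reuse the first resource that is idle by this task's start time
        for i, last_finish in enumerate(resources):
            if last_finish <= start:
                resources[i] = finish
                break
        else:
            resources.append(finish)  # no idle slot fits: use one more resource
    return len(resources)

# four tasks that an input schedule could have spread over four resources
print(consolidate([(0, 3), (1, 2), (3, 5), (2, 4)]))  # 2
```

Because every task keeps its original time slot, the makespan is unchanged; MER's delay limit then relaxes exactly this constraint to consolidate further.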
Evaluation
In this section, we evaluate MER with four different scheduling algorithms and five different workflow applications. The scheduling algorithms we examine are Critical Path First (CPF) [9], Dynamic Critical Path (DCP) [12], Critical-Path-on-a-Processor (CPOP) [13] and a greedy algorithm based on earliest finish time (EFT). We have implemented CPOP and EFT with a slight modification in order to run with an unlimited number of resources. Since the performance of scheduling algorithms is not
Related work
In this section, we review previous studies on workflow scheduling and discuss resource efficiency.
The execution of scientific workflows is typically planned and coordinated by schedulers/resource managers (e.g., [16], [17]) particularly with distributed resources. At the core of these schedulers are scheduling algorithms.
Traditionally, workflow scheduling focuses on the minimization of makespan (i.e., high performance) within tightly coupled computer systems like compute clusters with an
Conclusion
The abundance of cloud resources and their elasticity with a pay-as-you-go pricing model provide great opportunities for scientists and engineers, particularly in terms of cost and availability. However, current public/commodity cloud solutions are neither designed explicitly with scientific computing in mind, nor on par with the performance of traditional HPC systems. The performance and cost of a cloud cluster deployment are heavily dependent on how effectively the resource
Acknowledgement
Dr. Young Choon Lee would like to acknowledge the support of the Australian Research Council Discovery Early Career Researcher Award Grant DE140101628. Prof. Albert Zomaya would like to acknowledge the support of the Australian Research Council Discovery Grant DP130104591. Dr. Hyuck Han's work was partly supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2014R1A1A2055032).
References (23)
- et al., Multi-objective list scheduling of workflow applications in distributed computing infrastructures, J. Parallel Distrib. Comput. (2014)
- et al., Montage: a grid portal and software toolkit for science-grade astronomical image mosaicking, Int. J. Comput. Sci. Eng. (2009)
- et al., CyberShake: a physics-based seismic hazard model for Southern California, Pure Appl. Geophys. (2010)
- et al., LIGO: the Laser Interferometer Gravitational-wave Observatory, Science (1992)
- S. Bharathi, A. Chervenak, E. Deelman, G. Mehta, M.-H. Su, K. Vahi, Characterization of scientific workflows, in: ...
- et al., High-throughput, kingdom-wide prediction and annotation of bacterial non-coding RNAs, PLoS ONE (2008)
- et al., The case for energy-proportional computing, Computer (2007)
- M. Mao, M. Humphrey, Auto-scaling to minimize cost and meet application deadlines in cloud workflows, in: Proceedings...
- C. Lin, S. Lu, SCPOR: an elastic workflow scheduling algorithm for services computing, in: Proceedings of the 2011 IEEE...
- Y.C. Lee, A.Y. Zomaya, Stretch out and compact: workflow scheduling with resource abundance, in: Proceedings of the...