Scheduling parameter sweep workflow in the Grid based on resource competition
Highlights
► We propose a new Grid workflow scheduling algorithm for parameter sweep workflows.
► The Besom algorithm uses resource competition to schedule multiple workflow instances.
► A single-level feedback loop can be handled during workflow execution.
► The Besom algorithm performs better for workflows with complex structure and setting.
Introduction
Computational science that involves large data sets and intensive computation is performed on distributed computing networks. Such scientific applications necessitate the use and management of a supercomputing infrastructure. The Grid [1], which offers supercomputing power through shared distributed resources, can be used to fulfil this necessity. However, several different functions, such as resource discovery and security management [1], [2], are required in order to manage and provide access to distributed Grid resources. To address this need, software tools have been developed for these functionalities and are implemented in a multilayered architecture [1], [3]. Software at a lower layer provides services to the upper layer, with end-user applications, specifically e-Science applications, at the top layer.
To facilitate e-Science applications, workflow technology has been introduced into scientific domains and has become a necessary tool to help scientists perform their work in the Grid environment [4], [5], [6], [7]. Traditionally, workflow is a representation of business processes that consists of a set of tasks and the dependencies between them, specified in control structures or patterns [8]. Workflow has the ability to orchestrate services from heterogeneous sources, such as web services and existing software packages, to streamline business processes. In the same manner, scientific workflows can be utilised to orchestrate heterogeneous computation procedures developed by different parties to represent scientific processes in e-Science. Scientific workflow management systems such as Kepler [9] and Pegasus [10] (which usually provide a graphical user interface for composing workflows) can then be used to automatically execute scientific processes in the Grid.
In order to execute a scientific workflow in the Grid, a workflow scheduler is required. The workflow scheduler employs various Grid workflow scheduling algorithms to decide which task in the workflow is to be executed by which Grid resource so that the execution of the scientific workflow can meet constraints or criteria such as performance and cost. However, scheduling workflows in the Grid can be challenging. Unlike traditional scheduling problems in which resources are likely to be static [11], the distributed resources in the Grid are heterogeneous and may change their availability. Because the Grid has less centralised control [1], [12], a resource may join the Grid and make itself available to run applications, but may also become unavailable due to the decision of its owner or hardware failure. Scheduling in the Grid therefore usually needs to be done rapidly in order to make use of the immediate resources.
In some scientific experiments, the same process is repeated several times with varying input parameters to study or to optimise the parameters in the experiments [13]. For example, quantum chemical calculations optimise four parameters to give the best pseudo-atom surface [14]. This process is called parameter sweep and can also be automated using a workflow. Abramson, Enticott, and Altintas [14] proposed a technique to parallelise the execution of a parameter sweep workflow running in the Grid environment by cloning the repeated tasks to improve the overall execution time. This can be viewed as concurrently executing multiple workflow instances of the same workflow specification. Because every instance then competes for the same set of resources, it is necessary to ensure that the scheduling of tasks and the allocation of Grid resources allow the execution of these multiple workflow instances to meet execution constraints.
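To make the cloning concrete, the expansion of a parameter sweep into independent workflow instances can be sketched as follows; the task name `optimise` and the parameters `alpha` and `beta` are hypothetical, not taken from [14]:

```python
from itertools import product

def expand_sweep(task_name, parameters):
    """Clone a repeated task once per parameter combination, i.e. the
    Cartesian product of all parameter value lists.  Each clone can
    then be scheduled as part of its own workflow instance."""
    names = list(parameters)
    return [
        {"task": task_name, "params": dict(zip(names, combo))}
        for combo in product(*(parameters[name] for name in names))
    ]

# Sweeping two hypothetical parameters yields 3 x 2 = 6 instances that
# all compete for the same set of Grid resources.
instances = expand_sweep("optimise", {"alpha": [0.1, 0.5, 0.9], "beta": [1, 2]})
print(len(instances))  # 6
```

Each element of `instances` is one concrete task binding, so executing them concurrently is exactly the multiple-instance situation described above.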
Although many Grid workflow scheduling techniques exist, most of these scheduling algorithms only deal with a single Grid workflow at a time and assume that Grid workflows contain no loop structure [15], [16]. These techniques neglect the issues that may arise when multiple instances of a scientific workflow are executed in parallel and in iteration. Another issue is that existing Grid workflow scheduling algorithms do not take into account the dependency of tasks on specialised resources; nor do they consider the popularity of a resource [16], [17], [18]. For example, assigning a lengthy task to a resource that is shared by many tasks might delay the overall execution. This limitation is amplified with parallel execution of parameter sweep workflows due to the fact that every instance of a parameter sweep workflow, as mentioned earlier, requires the same set of resources. The situation may be exacerbated if there is competition for a scarce special resource.
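The delay caused by placing a lengthy task on a popular resource can be shown with a small makespan calculation; the resources, tasks and run times below are invented for illustration:

```python
def makespan(assignment, durations):
    """Finish time of the busiest resource, assuming each resource
    runs its assigned tasks one after another."""
    load = {}
    for task, resource in assignment.items():
        load[resource] = load.get(resource, 0) + durations[task][resource]
    return max(load.values())

# Hypothetical run times: tasks t1..t3 depend on the special resource
# R1, while the lengthy task can run on either R1 or R2.
durations = {
    "t1": {"R1": 10},
    "t2": {"R1": 10},
    "t3": {"R1": 10},
    "long": {"R1": 30, "R2": 40},
}

# Placing the lengthy task on the popular resource R1 delays everything
# queued behind it, even though R1 runs that task faster in isolation.
greedy = {"t1": "R1", "t2": "R1", "t3": "R1", "long": "R1"}
aware = {"t1": "R1", "t2": "R1", "t3": "R1", "long": "R2"}

print(makespan(greedy, durations))  # 60
print(makespan(aware, durations))   # 40
```

Offloading the lengthy task to the slower but uncontested resource reduces the makespan from 60 to 40, which is the intuition behind scheduling by resource competition.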
In this paper, we propose a scheduling technique for multiple scientific workflow instances, based on the resource dependencies of tasks, that is also able to handle loop structures. The proposed algorithm aims to minimise the makespan of the parameter sweep workflow execution. We use parameter sweep workflows as a case study to illustrate the execution of workflows with high resource competition.
The rest of this paper is structured as follows. Section 2 describes related work in the area of Grid workflow scheduling. The proposed parameter sweep workflow scheduling technique is explained in Section 3, and the extension for loop handling is explained in Section 4. The scheduler that implements the proposed technique is briefly described in Section 5, while the simulation results and the evaluation are presented in Section 6. Finally, Section 7 concludes this paper along with potential future research directions.
Section snippets
Related work
To facilitate the understanding of the context of this research, this section presents existing work related to scientific workflows and workflow scheduling. The workflow concept is first explained, along with scientific workflow, which is the form of workflow used in science and research communities. Parameter sweep workflow, a special class of scientific workflow used in parametric studies, is subsequently described. Lastly, several existing workflow scheduling techniques are described.
Scheduling parallel execution of parameter sweep workflow
This section describes our proposed scheduling technique, named “Besom”; the word denotes a type of broom associated with witches, chosen to imply that the proposed algorithm can cleverly “sweep” parameters. To explain the technique, each of the terms involved is first defined, followed by the explanation of the Besom scheduling algorithm. For ease of understanding, in this section we assume that the parameter sweep workflow does not contain loop structures. The support for loop
Scheduling cyclic workflow graph
Usually, Grid workflow scheduling algorithms, such as the Min–Min and HEFT algorithms, assume that workflows lack loop structures [18], [36], [38], and leave the handling of loops as a separate concern that can be addressed by methods such as the loop unrolling technique suggested by Prodan and Fahringer in [24]. However, to unroll a loop, the number of loop iterations must be known or predicted [24]. This information might not be available or the prediction might not be reliable in every
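As a sketch of the loop unrolling technique discussed above, assuming for illustration that the loop body is a simple chain of tasks and that the iteration count is known in advance (which is exactly the limitation at issue):

```python
def unroll_loop(body_tasks, iterations):
    """Unroll a cyclic workflow fragment into an acyclic chain by
    cloning the loop body once per iteration.  The iteration count
    must be known up front, which is the limitation noted above
    for loop unrolling."""
    edges = []
    prev_last = None
    for i in range(1, iterations + 1):
        clones = [f"{task}_iter{i}" for task in body_tasks]
        if prev_last is not None:
            # The back edge of the loop becomes a forward edge
            # between consecutive iterations.
            edges.append((prev_last, clones[0]))
        edges.extend(zip(clones, clones[1:]))
        prev_last = clones[-1]
    return edges
```

For a two-task body unrolled three times, this produces the acyclic chain A_iter1 → B_iter1 → A_iter2 → … → B_iter3, which any DAG scheduler can then handle.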
Scheduler implementation
In this section, we briefly describe the prototype design of the Grid workflow scheduler that uses the Besom scheduling algorithm. The prototype has been developed as a parameter sweep workflow scheduler in Nimrod/K [13], [14], which is built on the Kepler system [9] and provides workflow schedulers with access to the underlying features of Grid middleware tools as well as Nimrod’s parameter exploration tools. It gives a scheduler access to all the file transfers, compute resource scheduling
Performance evaluation
To evaluate and analyse the behaviour of the Besom scheduling algorithm, we compare its performance with three existing batch-mode scheduling algorithms: the Min–Min algorithm, the Max–Min algorithm and the XSufferage algorithm. In order to control the resource availability and network conditions, we use a simulation environment, implemented in Nimrod/K [14], and create dummy actors to represent tasks in a parameter sweep workflow. These actors simulate task execution by going into sleep
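As background on the baselines, a minimal sketch of the batch-mode Min–Min heuristic (one of the three comparison algorithms) is given below; the task and resource names and the run-time estimates are illustrative only:

```python
def min_min(tasks, resources, eta):
    """Batch-mode Min-Min heuristic: among all unmapped tasks, pick
    the (task, resource) pair with the smallest earliest completion
    time, map the task to that resource, and repeat.  eta[t][r] is
    the estimated run time of task t on resource r."""
    ready = dict.fromkeys(resources, 0.0)  # resource ready times
    schedule = {}
    unmapped = set(tasks)
    while unmapped:
        task, resource, finish = min(
            ((t, r, ready[r] + eta[t][r]) for t in unmapped for r in resources),
            key=lambda candidate: candidate[2],
        )
        schedule[task] = resource
        ready[resource] = finish
        unmapped.remove(task)
    return schedule, max(ready.values())

# A tiny example: two tasks, two resources, hypothetical run times.
eta = {"a": {"r1": 2, "r2": 5}, "b": {"r1": 3, "r2": 4}}
schedule, span = min_min(["a", "b"], ["r1", "r2"], eta)
print(schedule, span)  # {'a': 'r1', 'b': 'r2'} 4.0
```

Max–Min differs only in selecting the task whose best completion time is largest, while XSufferage selects by the gap between a task's best and second-best options.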
Conclusion and future work
In this research, we proposed a new Grid workflow scheduling algorithm which aims to minimise the makespan of the entire execution of parameter sweep workflows. In addition, the algorithm is designed to reduce bottlenecks in workflow executions in the Grid (by considering resource competition); to manage multiple workflow instances; and to schedule workflows incorporating a loop structure.
The Besom algorithm works by retrieving information about the resource requirements of the tasks in the
References (60)
- et al., Application of grid computing to parameter sweeps and optimizations in molecular modeling, Future Generation Computer Systems (2005)
- et al., Embedding optimization in computational science workflows, Journal of Computational Science (2010)
- et al., Scheduling workflow applications on processors with different capabilities, Future Generation Computer Systems (2006)
- I.T. Foster, The anatomy of the grid: enabling scalable virtual organizations, in: Proceedings of the 7th International...
- et al., The Grid 2: Blueprint for a New Computing Infrastructure (2004)
- I. Foster, Globus toolkit version 4: software for service-oriented systems, in: IFIP International Conference on...
- Z. Jianting, Ontology-driven composition and validation of scientific grid workflows in Kepler: a case study of...
- et al., Using web services and scientific workflow for species distribution prediction modeling
- D.P. Deana, Supporting large-scale science with workflows, in: Proceedings of the 2nd Workshop on Workflows in support...
- et al., Examining the challenges of scientific workflows, Computer (2007)
- The application of Petri nets to workflow management, The Journal of Circuits, Systems and Computers
- Scientific workflow management and the Kepler system, Concurrency and Computation: Practice and Experience
- Pegasus: a framework for mapping complex scientific workflows onto distributed systems, Scientific Programming
- Scheduling: Theory, Algorithms, and Systems
- What is the grid? A three point checklist, GRIDtoday
- Parameter exploration in science and engineering using many-task computing, IEEE Transactions on Parallel and Distributed Systems
- Taxonomies of the multi-criteria grid workflow scheduling problem
- Workflow scheduling algorithms for grid computing
- Performance-effective and low-complexity task scheduling for heterogeneous computing, IEEE Transactions on Parallel and Distributed Systems
- Scientific workflow: a survey and research directions
- Workflow patterns, Distributed and Parallel Databases
- A survey of automated web service composition methods
- Taverna: a tool for the composition and enactment of bioinformatics workflows, Bioinformatics
- Wings: intelligent workflow-based design of computational experiments, IEEE Intelligent Systems
- Scientific grid workflows
- Parameter scan of an effective group difference pseudopotential using grid computing, New Generation Computing
Cited by (14)
- Taxonomies of workflow scheduling problem and techniques in the cloud, Future Generation Computer Systems (2015)
- A new optimization phase for scientific workflow management systems, Future Generation Computer Systems (2014)
- Identifying information requirement for scheduling Kepler workflow in the cloud, Procedia Computer Science (2014)
- Genetas - an optimized task scheduling strategy using genetic algorithm for parallel and distributed computing environment, ARPN Journal of Engineering and Applied Sciences (2021)
Sucha Smanchat is currently a lecturer at the Faculty of Information Technology, King Mongkut’s University of Technology North Bangkok, Thailand. He obtained his Ph.D. at Monash University in Melbourne, Australia. During his study, he was involved with the development of a prototype scheduler for the Nimrod/K system. His current research interests are in Grid and Cloud workflow scheduling techniques.
Maria Indrawan works in the Faculty of Information Technology at Monash University. Maria received her Ph.D. from Monash University in Melbourne, Australia. Her areas of interest and expertise include data management in e-science, data mining, information retrieval and pervasive computing. She has authored numerous articles in these topic areas and has served on a number of program committees of international conferences. In 2009 she spent her sabbatical at the San Diego Supercomputer Centre (SDSC) and California Institute of Telecommunication and Information Technology (CALIT2) in the USA, working with scientists from such varied fields as computer science, biochemistry and physics. Through exposure to a number of e-science projects at those institutions, she developed a greater interest in data management for e-science.
Sea Ling works in the Faculty of Information Technology at Monash University as a Senior Lecturer. Sea received his Ph.D. from Monash University in Melbourne, Australia. He is interested in Petri net theory, workflows and software engineering. Recently, his research involves applying these theories to various research areas such as Grid computing and pervasive computing. He has supervised a number of research students in these topic areas, authored numerous articles and has served on a number of program committees of international conferences.
Colin Enticott is currently employed as a research scientist by the MeSsAGE Lab group at Monash University, Melbourne. He has been active in the area of distributed computing for over 10 years. His research focus is in the area of Grid Computing and Workflow Management, and he is about to complete his Ph.D. Previous projects include the Nimrod Portal and deploying Nimrod on GrangeNet.
David Abramson is a full professor of computer science at Monash University, Clayton, Australia, where he was the department chair from 1997 to 2002. He is currently the Director of the Monash e-Education Centre, the Science Director of the Monash e-Research Centre and the Director of the Monash e-Science and Grid Engineering Lab (MeSsAGE).
Before joining Monash University in 1997, he held appointments at Griffith University, CSIRO, and RMIT. At CSIRO, he was the program leader of the Division of Information Technology High Performance Computing Program, and was also an adjunct associate professor at RMIT in Melbourne. He has held senior management positions in the Cooperative Research Centre for Intelligent Decision Systems and the Cooperative Research Centre for Enterprise Distributed Systems.
He has chaired a number of international conferences, and has published more than 200 papers and technical documents. He has participated in seminars and received awards around Australia and internationally, and has received more than $8 million in research grants. His current research interests are in high-performance computer systems design and software engineering tools for programming parallel and distributed supercomputers. He is a Fellow of the Association for Computing Machinery (ACM), a Fellow of the Australian Computer Society (ACS) and a senior member of the IEEE.