OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: On the Feasibility of Simulation-Driven Portfolio Scheduling for Cyberinfrastructure Runtime Systems

Conference

Runtime systems that automate the execution of applications on distributed cyberinfrastructures need to make scheduling decisions. Researchers have proposed many scheduling algorithms, but most of them are designed based on analytical models and assumptions that may not hold in practice. The literature is thus rife with algorithms that have been evaluated only within the scope of their underlying assumptions but whose practical effectiveness is unclear. It is thus difficult for developers to decide which algorithm to implement in their runtime systems. To obviate the above difficulty, we propose an approach by which the runtime system executes, throughout application execution, simulations of this very execution. Each simulation is for a different algorithm in a scheduling algorithm portfolio, and the best algorithm is selected based on simulation results. The main objective of this work is to evaluate the feasibility and potential merit of this portfolio scheduling approach, even in the presence of simulation inaccuracy, when compared to the traditional one-algorithm approach. We perform this evaluation via a case study in the context of scientific workflows. Our main finding is that portfolio scheduling can outperform the best one-algorithm approach even in the presence of relatively large simulation inaccuracies.
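
As a rough illustration only, the selection step described in the abstract (simulate the execution once per algorithm in the portfolio, then pick the algorithm with the best simulated result) might be sketched in Python as follows. Everything in this sketch is an assumption made for illustration and is not the authors' implementation: the tasks are independent rather than a workflow, the three-policy `PORTFOLIO`, the toy `simulate_makespan` simulator, and the uniform multiplicative `error` that stands in for simulation inaccuracy are all hypothetical.

```python
import heapq
import random


def simulate_makespan(durations, num_workers, priority):
    """Toy simulator (hypothetical): greedy list scheduling of independent tasks.

    Tasks are ordered by `priority`, and each task is started on the worker
    that becomes free first. Returns the simulated makespan.
    """
    order = sorted(durations, key=priority)  # stable sort keeps ties in input order
    free_at = [0.0] * num_workers            # time at which each worker is free
    heapq.heapify(free_at)
    for d in order:
        start = heapq.heappop(free_at)       # earliest available worker
        heapq.heappush(free_at, start + d)
    return max(free_at)


# Hypothetical portfolio: each entry maps a name to a task-priority function.
PORTFOLIO = {
    "fifo": lambda d: 0,            # constant key: keep submission order
    "shortest_first": lambda d: d,
    "longest_first": lambda d: -d,
}


def pick_algorithm(durations, num_workers, error=0.0, rng=None):
    """Simulate every algorithm in the portfolio and return the best one.

    `error` is an assumed relative inaccuracy: each simulated makespan is
    scaled by a factor drawn uniformly from [1 - error, 1 + error].
    """
    rng = rng or random.Random(0)
    estimates = {}
    for name, priority in PORTFOLIO.items():
        exact = simulate_makespan(durations, num_workers, priority)
        estimates[name] = exact * (1.0 + rng.uniform(-error, error))
    best = min(estimates, key=estimates.get)
    return best, estimates


if __name__ == "__main__":
    remaining_tasks = [2, 3, 7, 1, 4, 6]  # made-up task durations
    print(pick_algorithm(remaining_tasks, num_workers=2))
    print(pick_algorithm(remaining_tasks, num_workers=2, error=0.2))
```

With `error=0.0`, the three simulations of this made-up workload give makespans of 14, 13, and 12, so `longest_first` is selected. In a runtime system following the approach in the abstract, a call like `pick_algorithm` would be repeated during execution on the tasks that remain; with a nonzero `error`, the selected algorithm may differ from the truly best one, which is the effect the paper sets out to evaluate.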

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1922323
Resource Relation:
Journal Volume: 13592; Conference: 25th Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP 2022), Lyon, France, 6/3/2022
Country of Publication:
United States
Language:
English
