FGFS: Feature Guided Frontier Scheduling for SIMT DAGs

Ghose, Anirban; Dey, Soumyajit

doi:10.1007/s11227-022-04323-8

Published: 16 February 2022

Volume 78, pages 11702–11743, (2022)

The Journal of Supercomputing

Abstract

In the past decade, heterogeneous multicore architectures with support for Single Instruction Multiple Thread (SIMT) style computing have become the standard platform for scheduling HPC applications. Here, applications are typically modelled as a set of data-parallel tasks with dependencies represented as a directed acyclic graph (DAG). The execution time of each constituent task in the DAG is known beforehand and is leveraged by scheduling algorithms (list- or cluster-based) to ascertain near-optimal schedules at runtime. However, in an online setting, where applications are submitted by multiple users and the types of applications are unrestricted, execution time information is highly unlikely to be available for every program. In this context, we propose a class of intelligent algorithms for heterogeneous CPU-GPU platforms that leverage static analysis-assisted machine learning techniques to decide device assignments at runtime, thus bypassing the requirement for expensive offline profiling passes. We formalize relevant task-level ranking metrics and discuss how existing scheduling techniques can be adapted for our proposed class of algorithms. We also devise an online cluster scheduling algorithm that supports dynamic task arrival by determining, in any given scheduling epoch, mapping decisions for a subset of tasks in a DAG. We perform a detailed comparative analysis of our proposed cluster and list scheduling heuristics via extensive simulation experiments on a variety of heterogeneous multicore platform configurations and observe speedups in the range of 1.1–1.5× for cluster scheduling over list scheduling.
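
To make the terminology concrete, the following is a minimal Python sketch of two generic building blocks that list schedulers for DAGs commonly rely on: the frontier (the set of tasks whose predecessors have all finished) and a HEFT-style upward rank used to order that frontier. It is not the FGFS algorithm itself. The function names, the toy DAG, and the per-task cost values are all hypothetical, and communication costs are ignored. In the setting described above, such costs would not be known in advance and would have to come from some estimate.

```python
from collections import defaultdict


def frontier(dag, finished):
    """Return the unfinished tasks whose predecessors have all finished."""
    preds = defaultdict(set)
    for task, successors in dag.items():
        for succ in successors:
            preds[succ].add(task)
    return sorted(t for t in dag if t not in finished and preds[t] <= finished)


def upward_rank(dag, cost):
    """HEFT-style upward rank without communication costs:
    rank(t) = cost(t) + max(rank(s) for each successor s), or cost(t) for exit tasks."""
    memo = {}

    def rank(task):
        if task not in memo:
            memo[task] = cost[task] + max((rank(s) for s in dag[task]), default=0.0)
        return memo[task]

    return {task: rank(task) for task in dag}


# Hypothetical diamond-shaped DAG: a -> {b, c} -> d (adjacency list of successors).
dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
# Hypothetical per-task cost estimates (assumed values, not measured data).
cost = {"a": 2.0, "b": 3.0, "c": 1.0, "d": 4.0}

ranks = upward_rank(dag, cost)
ready = frontier(dag, finished={"a"})            # tasks b and c become ready once a is done
order = sorted(ready, key=lambda t: -ranks[t])   # highest rank first
```

Here the frontier after task `a` completes contains `b` and `c`, and ordering by descending rank places `b` first because it lies on the longer remaining path to the exit task.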

Author information

Authors and Affiliations

Computer Science and Engineering Department, Indian Institute of Technology, Kharagpur, Kharagpur, West Bengal, 721302, India
Anirban Ghose & Soumyajit Dey

Corresponding author

Correspondence to Anirban Ghose.

About this article

Cite this article

Ghose, A., Dey, S. FGFS: Feature Guided Frontier Scheduling for SIMT DAGs. J Supercomput 78, 11702–11743 (2022). https://doi.org/10.1007/s11227-022-04323-8

Accepted: 15 January 2022
Published: 16 February 2022
Issue Date: June 2022
DOI: https://doi.org/10.1007/s11227-022-04323-8
