FGFS: Feature Guided Frontier Scheduling for SIMT DAGs

The Journal of Supercomputing

Abstract

In the past decade, heterogeneous multicore architectures with support for Single Instruction Multiple Thread (SIMT) style computing have become the platform of choice for HPC applications. Here, applications are typically modelled as a set of data-parallel tasks whose dependencies are represented as a directed acyclic graph (DAG). The execution time of each constituent task in the DAG is typically known beforehand and is leveraged by scheduling algorithms (list or cluster based) to ascertain near-optimal schedules at runtime. However, in an online setting, where applications are submitted by multiple users and the types of applications are unrestricted, execution time information is unlikely to be available for every program. In this context, we propose a class of intelligent algorithms for heterogeneous CPU-GPU platforms that leverage static analysis-assisted machine learning techniques to decide device assignments at runtime, thus bypassing the requirement for expensive offline profiling passes. We formalize relevant task-level ranking metrics and discuss how existing scheduling techniques can be adapted for our proposed class of algorithms. We also devise an online cluster scheduling algorithm that supports dynamic task arrival by determining, in any given scheduling epoch, mapping decisions for a subset of tasks in a DAG. We perform a detailed comparative analysis between our proposed cluster and list scheduling heuristics via extensive simulation experiments on a variety of heterogeneous multicore platform configurations and observe speedups in the range of 1.1–1.5× for cluster scheduling over list scheduling.
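To make the epoch-wise idea concrete, the following is a minimal sketch of frontier-based device mapping on a task DAG. It is not the algorithm from the article: the function name `frontier_schedule`, the per-task `scores` table, and the fixed 0.5 threshold are hypothetical stand-ins for the learned, static-analysis-driven device predictor, and the sketch assumes every task in a frontier finishes before the next epoch begins.

```python
def frontier_schedule(dag, predict_device):
    """Map tasks to devices one frontier (set of ready tasks) per epoch.

    dag: dict mapping each task to the list of its successor tasks.
         Every task must appear as a key, even if it has no successors.
    predict_device: callable task -> device label (hypothetical predictor).
    Returns a list of epochs; each epoch is a dict task -> device.
    """
    # Count unfinished predecessors for every task.
    indegree = {task: 0 for task in dag}
    for successors in dag.values():
        for succ in successors:
            indegree[succ] += 1

    # The first frontier holds the tasks with no predecessors.
    frontier = sorted(t for t, d in indegree.items() if d == 0)
    epochs = []
    while frontier:
        # Decide mappings only for the tasks that are ready in this epoch.
        epochs.append({task: predict_device(task) for task in frontier})
        next_frontier = []
        for task in frontier:
            for succ in dag[task]:
                indegree[succ] -= 1
                if indegree[succ] == 0:
                    next_frontier.append(succ)
        frontier = sorted(next_frontier)
    return epochs


# Hypothetical example: a diamond-shaped DAG with made-up GPU-affinity scores.
dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
scores = {"a": 0.9, "b": 0.2, "c": 0.7, "d": 0.4}  # assumed, not measured
epochs = frontier_schedule(dag, lambda t: "gpu" if scores[t] >= 0.5 else "cpu")
for i, mapping in enumerate(epochs):
    print(i, mapping)
```

Because each epoch looks only at the tasks that are currently ready, tasks that arrive later can be added to the graph between epochs without recomputing earlier decisions. A real scheduler would also rank tasks within a frontier and account for device availability, which this sketch omits.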

Author information

Corresponding author

Correspondence to Anirban Ghose.

About this article

Cite this article

Ghose, A., Dey, S. FGFS: Feature Guided Frontier Scheduling for SIMT DAGs. J Supercomput 78, 11702–11743 (2022). https://doi.org/10.1007/s11227-022-04323-8

  • DOI: https://doi.org/10.1007/s11227-022-04323-8
