HDSAP: heterogeneity-aware dynamic scheduling algorithm to improve performance of nanoscale many-core processors for unknown workloads

Kia, Keihaneh; Rajabzadeh, Amir

doi:10.1007/s11227-023-05159-6

Published: 23 March 2023

The Journal of Supercomputing, volume 79, pages 13341–13369 (2023)

Keihaneh Kia¹ & Amir Rajabzadeh¹

Abstract

Processor performance growth has continued through increasing the number of processing cores on the chip and scaling down the feature size of transistors. However, in the nanoscale era, side effects of scaling, such as induced heterogeneity in the performance, power, and soft error rate of identically designed cores, prevent the potential performance from being fully utilized. In this paper, we harness these side effects in shared-memory multicore processors with unknown workloads through a dynamic heuristic scheduling algorithm called HDSAP. HDSAP aims to maximize performance, i.e., to minimize the average response time, under power and reliability constraints in the presence of induced heterogeneity. To this end, we use a mathematical model to quantify task-to-core assignments based on performance variation. We also account for power variation by changing the selected cores when the power constraint is violated. To meet the reliability constraint, we use N-modular redundancy while accounting for the variation in the soft error rate of cores, which prevents under- or overestimation of reliability. To evaluate HDSAP, we run the SPLASH-2 benchmark suite on the Sniper and McPAT simulators. The results show that HDSAP reduces response time by 6%, 8%, and 25% compared with similar algorithms under the same power and reliability constraints.
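
To illustrate why per-core soft error rates matter for N-modular redundancy, the following is a minimal sketch and not the model used in the paper. It assumes independent replicas, exponentially distributed soft errors, and an ideal majority voter; the function names and the example error rates are hypothetical.

```python
import math
from typing import List


def core_reliability(ser_per_second: float, exec_time_s: float) -> float:
    """Probability that a task sees no soft error on one core.

    Assumption (not from the paper): errors arrive as a Poisson process
    with the given per-core soft error rate.
    """
    return math.exp(-ser_per_second * exec_time_s)


def nmr_reliability(replica_reliabilities: List[float]) -> float:
    """Probability that a strict majority of independent replicas is correct.

    Each replica may have a different reliability, so this evaluates the
    Poisson-binomial distribution with a small dynamic program.
    """
    n = len(replica_reliabilities)
    # dist[k] = probability that exactly k of the replicas seen so far are correct
    dist = [1.0] + [0.0] * n
    for r in replica_reliabilities:
        for k in range(n, 0, -1):
            dist[k] = dist[k] * (1.0 - r) + dist[k - 1] * r
        dist[0] *= 1.0 - r
    majority = n // 2 + 1
    return sum(dist[majority:])


if __name__ == "__main__":
    # Hypothetical per-core soft error rates (errors per second) and run time.
    rates = [1e-6, 5e-6, 2e-5]
    run_time_s = 100.0
    actual = nmr_reliability([core_reliability(s, run_time_s) for s in rates])
    # Assuming every core has the best-case rate overestimates reliability.
    nominal = nmr_reliability([core_reliability(min(rates), run_time_s)] * 3)
    print(f"heterogeneity-aware: {actual:.9f}, nominal assumption: {nominal:.9f}")
```

Under these assumptions, a scheduler that applies one nominal error rate to every core would misjudge whether a given set of replicas satisfies the reliability constraint, which is the kind of estimation error the abstract refers to.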


Data availability

Not applicable.

Notes

In this paper, a job denotes the incoming workload. A job is a set of dependent tasks whose dependency type is barrier synchronization.
An unknown workload is one in which the arrival time, departure time, and execution time of jobs are not known to the scheduler in advance.
Because this paper does not consider the effect of the memory controller on execution time, we eliminate that effect by setting the memory controller latency to zero. We will consider it in future studies.
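
The definitions above can be made concrete with a small sketch. It is an illustration under simplifying assumptions, not the paper's mathematical model: each task has a fixed amount of work in cycles, each core runs at a fixed frequency, and a barrier-synchronized job finishes when its slowest task finishes. All names and numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    arrival_s: float                # time at which the job reaches the scheduler
    task_work_cycles: List[float]   # hypothetical work per task, in cycles


def response_time(job: Job, start_s: float, core_freqs_hz: List[float]) -> float:
    """Response time of a barrier-synchronized job.

    Task i runs on the core with frequency core_freqs_hz[i]. Because of the
    barrier, the job completes only when its slowest task completes.
    """
    slowest = max(w / f for w, f in zip(job.task_work_cycles, core_freqs_hz))
    return (start_s + slowest) - job.arrival_s


def average_response_time(response_times: List[float]) -> float:
    """Mean response time over a set of completed jobs."""
    return sum(response_times) / len(response_times)


if __name__ == "__main__":
    job = Job(arrival_s=0.0, task_work_cycles=[2e9, 1e9])
    # Same two cores, two different task-to-core assignments.
    print(response_time(job, 0.0, [2e9, 1e9]))  # heavy task on the faster core
    print(response_time(job, 0.0, [1e9, 2e9]))  # heavy task on the slower core
```

With performance variation between nominally identical cores, the two assignments give different response times for the same job, which is why the task-to-core assignment is worth quantifying.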

References

Olofsson A (2017) Epiphany-V: A 1024 Processor 64-bit RISC system-on-chip. In: Hot Chips Symposium (HCS) ArXiv, abs/1610.01832
Intel® Xeon Phi™ Processor 7295. ark.intel.com/content/www/us/en/ark/products/128690/intel-xeon-phi-processor-7295–16gb-1–5-ghz-72-core.html. Accessed 1 June 2022
Dinechin B D (2015) Kalray MPPA®: Massively parallel processor array: Revisiting DSP acceleration with the Kalray MPPA Manycore processor. In: 2015 IEEE Hot Chips 27 Symposium (HCS), pp 1–27
Raghunathan B, Turakhia Y, Garg S, Marculescu D (2013) Cherry-picking: Exploiting process variations in dark-silicon homogeneous chip multi-processors. In: Design, Automation & Test in Europe Conference & Exhibition (DATE), pp 39–44. https://doi.org/10.7873/DATE.2013.023
Raji M, Nikseresht M (2022) UMOTS: an uncertainty-aware multi-objective genetic algorithm-based static task scheduling for heterogeneous embedded systems. J Supercomput 78:279–314. https://doi.org/10.1007/s11227-021-03887-1
Rangan K, Powell M D, Wei G Y, Brooks D (2011) Achieving uniform performance and maximizing throughput in the presence of heterogeneity. In: 2011 IEEE 17th International Symposium on High Performance Computer Architecture, pp 3–14. https://doi.org/10.1109/HPCA.2011.5749712
Huai-Ting L, Chou CY, Yuan-Ting H, Wu AY (2017) Variation-aware reliable many-core system design by exploiting inherent core redundancy. IEEE Trans Very Large Scale Integr VLSI Syst 25(10):2803–2816
Wang Y, Nörtershäuser D, Masson S L, Menaud J M (2019) Experimental characterization of variation in power consumption for processors of different generations. In: 15th IEEE International Conferences on Green Computing and Communications, Atlanta, United States, pp 1–9
Pathania A, Henkel J (2018) Task scheduling for many-cores with s-nuca caches. In: Design, Automation & Test in Europe (DATE), pp 557–562. https://doi.org/10.23919/DATE.2018.8342069
Salehi M, Shafique M, Kriebel F, Rehman S, Khavari Tavana M, Ejlali A, Henkel J (2015) dsReliM: Power-constrained reliability management in dark-silicon many-core chips under process variations. In: International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), pp 75–82. https://doi.org/10.1109/CODESISSS.2015.7331370
Kumar S A, Shafique M, Kumar A, Henkel J (2013) Mapping on multi/many-core systems: survey of current and emerging trends. In: 50th ACM/EDAC/IEEE Design Automation Conference (DAC), pp 1–10. https://doi.org/10.1145/2463209.2488734
Shafique M, Gnad D, Garg S, Henkel J (2015) Variability-aware dark silicon management in on-chip many-core systems. In: Design, Automation And Test In Europe Conference And Exhibition (DATE), pp 387–392. https://doi.org/10.7873/DATE.2015.0900
Rapp R, Pathania A, Henkel J (2018) Pareto-optimal power-and cache-aware task mapping for many-cores with distributed shared last-level cache. In: International Symposium on Low Power Electronics And Design (ISLPED), pp 1–6. https://doi.org/10.1145/3218603.3218630
Carlson T E, Heirman W, Eeckhout L (2011) Sniper: exploring the level of abstraction for scalable and accurate parallel multi-core simulation. In: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, pp 1–12. https://doi.org/10.1145/2063384.2063454
Li S, Ahn J H, Strong R D, Brockman J B, Tullsen D M, Jouppi N P (2009) McPAT: an integrated power, area, and timing modeling framework for multicore and manycore architectures. In: International Symposium on Microarchitecture (MICRO)
Woo S C, Ohara M, Torrie E, Singh J P, Gupta A (1995) The SPLASH-2 programs: characterization and methodological considerations. In: International Symposium on Computer Architecture (ISCA)
Gupta M, Bhargava L, Indu S (2021) Mapping techniques in multicore processors: current and future trends. J Supercomput 77:9308–9363. https://doi.org/10.1007/s11227-021-03650-6
Yesil S, Ozturk O (2022) Scheduling for heterogeneous systems in accelerator-rich environments. J Supercomput 78:200–221. https://doi.org/10.1007/s11227-021-03883-5
Xu J, Shi H, Chen Y (2022) Efficient tasks scheduling in multicore systems integrated with hardware accelerators. J Supercomput. https://doi.org/10.1007/s11227-022-04955-w
Bahrami F, Ranjbar B, Rohbani N, Ejlali A (2021) PVMC: task mapping and scheduling under process variation heterogeneity in mixed-criticality systems. IEEE Trans Emerg Topics Comput 10(2):1166–1177
Kapadia N, Pasricha S (2015) VARSHA: Variation and reliability-aware application scheduling with adaptive parallelism in the dark-silicon era. In: Design, Automation & Test in Europe Conferences & Exhibition (DATE) IEEE, pp 1060–1065. https://doi.org/10.7873/DATE.2015.0454
Rapp M et al (2020) Neural network-based performance prediction for task migration on S-NUCA many-cores. IEEE Trans Comput 70(10):1691–1704
Pathania A, Venkatramani V, Shafique M, Mitra T, Henkel J (2016) Optimal greedy algorithm for many-core scheduling. IEEE Trans Comput Aided Des Integr Circuits Syst 36(6):1054–1058. https://doi.org/10.1109/TCAD.2016.2618880
Liu G, Park J, Marculescu D (2015) Procrustes: power constrained performance improvement using extended maximize-then-swap algorithm. IEEE Trans Comput Aided Des Integr Circuits Syst 34(10):1664–1676. https://doi.org/10.1109/TCAD.2015.2421911
Kia K, Rajabzadeh A (2020) DASH: dynamic scheduling algorithm for single-ISA heterogeneous nano-scale many-cores. In: 10th International Conference on Computer and Knowledge Engineering (ICCKE), pp 447–452. https://doi.org/10.1109/ICCKE50421.2020.9303673
Yuan B, Li B, Chen H, Zeng Z, Yao X (2020) Multi-objective redundancy hardening with optimal task mapping for independent tasks on multi-cores. Soft Comput 24:981–995
Suraj P, Navonil C, Prasun G (2021) Dynamic task allocation and scheduling with contention-awareness for network-on-chip based multicore systems. J Syst Archit 115:102020. https://doi.org/10.1016/j.sysarc.2021.102020
Boroumand B, Yaghoubi E, Barekatain B (2021) An enhanced cost-aware mapping algorithm based on improved shuffled frog leaping in network on chips. J Supercomput 77:498–522
Feitelson D G, Rudolph L (1998) Metrics and Benchmarking for Parallel Job Scheduling. In: Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP) 1459 Springer Berlin Heidelberg. https://doi.org/10.1007/BFb0053978
Wu YJ, Yu ST, Lai KC, Chhabra A, Chang HY, Huang KC (2020) Two-level utilization-based processor allocation for scheduling moldable jobs. J Supercomput 76:10212–10239. https://doi.org/10.1007/s11227-020-03246-6
Cerrolaza JP, Obermaisser R, Abella J, Cazorla FJ, Grüttner K, Agirre I, Ahmadian H, Allende I (2020) Multi-core devices for safety-critical systems: a survey. ACM Comput Surveys (CSUR) 53(4):1–38
Ansari M, Saber-Latibari J, Pasandideh M, Ejlali A (2019) Simultaneous management of peak-power and reliability in heterogeneous multicore embedded systems. IEEE Trans Parallel Distrib Syst 31(3):623–633. https://doi.org/10.1109/TPDS.2019.2940631

Funding

No funding was received for conducting this study.

Author information

Authors and Affiliations

Department of Computer Engineering and Information Technology, Razi University, Kermanshah, Iran
Keihaneh Kia & Amir Rajabzadeh

Contributions

All authors contributed to the study conception and design.

Corresponding author

Correspondence to Amir Rajabzadeh.

Ethics declarations

Conflict of interest

The authors have no competing interests as defined by Springer, or other interests that might be perceived to influence the results and/or discussion reported in this paper.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kia, K., Rajabzadeh, A. HDSAP: heterogeneity-aware dynamic scheduling algorithm to improve performance of nanoscale many-core processors for unknown workloads. J Supercomput 79, 13341–13369 (2023). https://doi.org/10.1007/s11227-023-05159-6

Accepted: 04 March 2023
Published: 23 March 2023
Issue Date: August 2023
DOI: https://doi.org/10.1007/s11227-023-05159-6
