Abstract
The scale and heterogeneity of exascale systems increase the complexity of programming the applications that exploit them. Task-based approaches with support for nested tasks are a good fit for such systems because of the flexibility inherent in the task concept. Mirroring the hierarchical organization of the hardware, this paper proposes establishing a hierarchy in the application workflow: coarse-grain tasks are mapped to the broader hardware components, and finer-grain tasks to the lowest levels of the resource hierarchy, where they benefit from lower-latency, higher-bandwidth communication and can exploit locality. Building on a proposed mechanism that encapsulates within each task the management of its finer-grain parallelism, the paper presents a hierarchical peer-to-peer engine that orchestrates the execution of workflow hierarchies with fully decentralized management. Tests conducted on the MareNostrum 4 supercomputer with a prototype implementation demonstrate the validity of the proposal: they support the execution of up to 707,653 tasks on 2,400 cores and achieve speedups of up to 106 times over executions with a single workflow and centralized management.
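The idea of a task that encapsulates the management of its own finer-grain parallelism can be illustrated with a minimal sketch. This is not the engine or the programming-model API described in the paper: the function names (`fine_task`, `coarse_task`, `run_workflow`) are hypothetical, and local thread pools from the Python standard library stand in for the levels of a distributed resource hierarchy.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor


def fine_task(x: int) -> int:
    """Fine-grain task: a small unit of work (here, squaring a number)."""
    return x * x


def coarse_task(block: list[int], inner_workers: int = 2) -> int:
    """Coarse-grain task that manages its own fine-grain parallelism.

    It owns a private executor (an assumption standing in for a lower
    level of the resource hierarchy), so the parent level never sees the
    fine-grain tasks; it only receives the aggregated result.
    """
    with ThreadPoolExecutor(max_workers=inner_workers) as inner:
        return sum(inner.map(fine_task, block))


def run_workflow(blocks: list[list[int]], outer_workers: int = 2) -> int:
    """Top level: schedules only coarse-grain tasks, one per block."""
    with ThreadPoolExecutor(max_workers=outer_workers) as outer:
        return sum(outer.map(coarse_task, blocks))


if __name__ == "__main__":
    blocks = [[1, 2, 3], [4, 5], [6]]
    print(run_workflow(blocks))  # sum of squares of 1..6 = 91
```

In this sketch the top-level scheduler handles three coarse-grain tasks rather than six fine-grain ones; each nested pool is created and torn down inside its parent task, which loosely mirrors the decentralized management the abstract describes.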
References
Afgan, E., et al.: The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update. Nucleic Acids Res. 46(1), 537–544 (2018)
Cid-Fuentes, J.Á., et al.: dislib: Large scale high performance machine learning in Python. In: Proceedings of the 15th International Conference on eScience, pp. 96–105 (2019)
Dask Development Team: Dask: Library for dynamic task scheduling (2016). https://dask.org
Di Tommaso, P., et al.: Nextflow enables reproducible computational workflows. Nat. Biotechnol. 35(4), 316–319 (2017)
Duran, A., et al.: OmpSs: a proposal for programming heterogeneous multi-core architectures. Parallel Process. Lett. 21(02), 173–193 (2011)
Ejarque, J., et al.: A hierarchic task-based programming model for distributed heterogeneous computing. Int. J. High Perform. Comput. Appl. 33(5), 987–997 (2019)
Graf, H., et al.: Parallel support vector machines: the cascade SVM. In: Advances in Neural Information Processing Systems, vol. 17 (2004)
Herault, T., et al.: Composition of algorithmic building blocks in template task graphs. In: 2022 IEEE/ACM Parallel Applications Workshop: Alternatives To MPI+X (PAW-ATM), pp. 26–38 (2022)
Ho, T.K.: Random decision forests. In: Proceedings of 3rd International Conference on Document Analysis and Recognition, vol. 1, pp. 278–282 (1995)
Intel Corporation: OneAPI TBB Nested parallelism (2022). https://oneapi-src.github.io/oneTBB/main/tbb_userguide/Cancellation_and_Nested_Parallelism.html
Lerman, P.: Fitting segmented regression models by grid search. J. R. Stat. Soc.: Ser. C: Appl. Stat. 29(1), 77–84 (1980)
Lordan, F., et al.: Artifact and instructions to generate experimental results for the Euro-Par 2023 proceedings paper: hierarchical management of extreme-scale task-based applications. https://doi.org/10.6084/m9.figshare.23552229
Lordan, F., et al.: ServiceSs: an interoperable programming framework for the cloud. J. Grid Comput. 12(1), 67–91 (2014)
Lordan, F., Lezzi, D., Badia, R.M.: Colony: parallel functions as a service on the cloud-edge continuum. In: Sousa, L., Roma, N., Tomás, P. (eds.) Euro-Par 2021. LNCS, vol. 12820, pp. 269–284. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-85665-6_17
Mölder, F., et al.: Sustainable data analysis with Snakemake. F1000Research 10(33) (2021)
Perez, J.M., et al.: Improving the integration of task nesting and dependencies in OpenMP. In: 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 809–818 (2017)
Planas, J., et al.: Hierarchical task-based programming with StarSs. Int. J. High Perform. Comput. Appl. 23(3), 284–299 (2009)
Rabenseifner, R., et al.: Hybrid MPI/OpenMP parallel programming on clusters of multi-core SMP nodes. In: 2009 17th Euromicro International Conference on Parallel, Distributed and Network-based Processing, pp. 427–436 (2009)
Vandierendonck, H., et al.: Parallel programming of general-purpose programs using task-based programming models. In: 3rd USENIX Workshop on Hot Topics in Parallelism (HotPar 11) (2011)
Wozniak, J.M., et al.: Swift/T: large-scale application composition via distributed-memory dataflow processing. In: 2013 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, pp. 95–102 (2013)
Yoo, A.B., et al.: SLURM: simple Linux utility for resource management. In: Job Scheduling Strategies for Parallel Processing, pp. 44–60 (2003)
Acknowledgements and Data Availability
This work has been supported by the Spanish Government (PID2019-107255GB), by MCIN/AEI/10.13039/501100011033 (CEX2021-001148-S), by the Departament de Recerca i Universitats de la Generalitat de Catalunya to the Research Group MPiEDist (2021 SGR 00412), and by the European Commission through the Horizon Europe Research and Innovation program under Grant Agreements 101070177 (ICOS project) and 101016577 (AI-Sprint project). The data and code that support this study are openly available in figshare [12].
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Lordan, F., Puigdemunt, G., Vergés, P., Conejero, J., Ejarque, J., Badia, R.M. (2023). Hierarchical Management of Extreme-Scale Task-Based Applications. In: Cano, J., Dikaiakos, M.D., Papadopoulos, G.A., Pericàs, M., Sakellariou, R. (eds) Euro-Par 2023: Parallel Processing. Euro-Par 2023. Lecture Notes in Computer Science, vol 14100. Springer, Cham. https://doi.org/10.1007/978-3-031-39698-4_8
DOI: https://doi.org/10.1007/978-3-031-39698-4_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-39697-7
Online ISBN: 978-3-031-39698-4
eBook Packages: Computer Science, Computer Science (R0)