Abstract
In the last few decades, applications have become larger and more complex, and their users increasingly need a simpler way to identify units of work. Tasking models meet this need by making the scheduling of units of work much more user-friendly. However, tasking models also bring the problem of granularity management. Finding an application's optimal granularity is a frequent and sometimes challenging task for a wide range of recursive algorithms, and a well-chosen granularity often yields a substantial increase in performance.
Finding that optimum is not easy. Several factors must be considered, all directly related to a lack or an excess of parallelism in the application. There is no general solution, because the optimal granularity depends on both algorithm and system characteristics. One commonly used method is to tune the application experimentally, running it with different granularities until an optimal one is found. This paper proposes several heuristics which, combined with appropriate monitoring techniques, allow a runtime system to tune the granularity of recursive applications automatically. The solution is independent of the architecture, the execution environment, and the application being tested. A reference implementation in OmpSs, a task-parallel programming model, shows the programmability, ease of use, and competitive performance of the proposed solution. Results show that the proposed solution achieves, in every scenario, at least 75% of the performance of optimally tuned applications.
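The manual tuning method mentioned above can be illustrated with a small sketch. This is not the paper's OmpSs implementation, nor one of its heuristics. It assumes granularity is controlled by a depth-based cut-off, and the names `fib`, `wall_time`, and `tune_cutoff` are hypothetical. Task creation is only counted, not executed in parallel, so the sketch shows the shape of the sweep rather than real speedups.

```python
import time


def fib(n, cutoff, depth=0, stats=None):
    """Recursive Fibonacci with a depth-based cut-off (hypothetical example).

    Calls at depth 1..cutoff stand in for task creations and are only
    counted; deeper calls represent plain serial recursion.
    """
    if stats is not None and 0 < depth <= cutoff:
        stats["tasks"] += 1
    if n < 2:
        return n
    return (fib(n - 1, cutoff, depth + 1, stats)
            + fib(n - 2, cutoff, depth + 1, stats))


def wall_time(run, cutoff, repeats=3):
    """Best-of-`repeats` wall-clock time of run(cutoff), in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run(cutoff)
        best = min(best, time.perf_counter() - start)
    return best


def tune_cutoff(candidates, cost):
    """Exhaustive sweep: return the candidate with the lowest cost."""
    return min(candidates, key=cost)


if __name__ == "__main__":
    # A deeper cut-off means more, finer-grained tasks.
    for c in range(4):
        stats = {"tasks": 0}
        fib(20, c, stats=stats)
        print(f"cutoff={c} tasks={stats['tasks']}")
```

In a real task-parallel program, the cost passed to `tune_cutoff` would be something like `lambda c: wall_time(run_app, c)`, where `run_app` (assumed here) launches the application with the given cut-off. The sweep has to be repeated for each new algorithm, input size, and machine, which is the manual effort that automatic tuning aims to remove.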
Acknowledgments
This work has been supported by the Spanish Ministry of Science and Innovation (contract TIN2015-65316), the grant SEV-2015-0493 of Severo Ochoa Program awarded by the Spanish Government, and by Generalitat de Catalunya (contract 2014-SGR-1051).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Navarro, A., Mateo, S., Perez, J.M., Beltran, V., Ayguadé, E. (2017). Adaptive and Architecture-Independent Task Granularity for Recursive Applications. In: de Supinski, B., Olivier, S., Terboven, C., Chapman, B., Müller, M. (eds) Scaling OpenMP for Exascale Performance and Portability. IWOMP 2017. Lecture Notes in Computer Science, vol 10468. Springer, Cham. https://doi.org/10.1007/978-3-319-65578-9_12
DOI: https://doi.org/10.1007/978-3-319-65578-9_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-65577-2
Online ISBN: 978-3-319-65578-9
eBook Packages: Computer Science (R0)