Abstract
OpenMP allows programmers to specify nested parallelism in parallel applications. In scientific applications, parallel loops are the most important source of parallelism. In this paper we present an automatic mechanism that dynamically detects the best way to exploit the parallelism of nested parallel loops. The mechanism bases its decision on the number of threads, the problem size, and the number of iterations of the loop. Our claim is that programmers should specify only the potential parallelism of the application and leave the runtime responsible for deciding how best to exploit it. We have implemented this mechanism inside the IBM XL runtime library. Evaluation shows that our mechanism dynamically adapts the generated parallelism to the application and runtime parameters, reaching the same speedup as the best static parallelization (chosen with a priori information).
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
Cite this paper
Duran, A., Silvera, R., Corbalán, J., Labarta, J. (2005). Runtime Adjustment of Parallel Nested Loops. In: Chapman, B.M. (eds) Shared Memory Parallel Programming with Open MP. WOMPAT 2004. Lecture Notes in Computer Science, vol 3349. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31832-3_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24560-5
Online ISBN: 978-3-540-31832-3
eBook Packages: Computer Science (R0)