Abstract
Nested OpenMP parallelism allows an application to spawn teams of nested threads. This hierarchical nature of thread creation and usage poses problems for performance measurement tools that must determine thread context to properly maintain per-thread performance data. In this paper we describe the problem and a novel solution for identifying threads uniquely. Our approach has been implemented in the TAU performance system and has been successfully used in profiling and tracing OpenMP applications with nested parallelism. We also describe how extensions to the OpenMP standard can help tool developers uniquely identify threads.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Morris, A., Malony, A.D., Shende, S.S. (2008). Supporting Nested OpenMP Parallelism in the TAU Performance System. In: Mueller, M.S., Chapman, B.M., de Supinski, B.R., Malony, A.D., Voss, M. (eds) OpenMP Shared Memory Parallel Programming. IWOMP 2005. Lecture Notes in Computer Science, vol 4315. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68555-5_23
DOI: https://doi.org/10.1007/978-3-540-68555-5_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68554-8
Online ISBN: 978-3-540-68555-5
eBook Packages: Computer Science (R0)