Supporting Nested OpenMP Parallelism in the TAU Performance System

  • Special Issue OpenMP
International Journal of Parallel Programming

Nested OpenMP parallelism allows an application to spawn teams of nested threads. This hierarchical nature of thread creation and usage poses problems for performance measurement tools that must determine thread context to properly maintain per-thread performance data. In this paper we describe the problem and a novel solution for identifying threads uniquely. Our approach has been implemented in the TAU performance system and has been successfully used in profiling and tracing OpenMP applications with nested parallelism. We also describe how extensions to the OpenMP standard can help tool developers uniquely identify threads.
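The identification problem described above can be illustrated with a short C sketch. It is not TAU's implementation, and the abstract does not say which mechanism or standard extensions the authors have in mind. The sketch only assumes an OpenMP 3.0 or later runtime, where `omp_get_level` and `omp_get_ancestor_thread_num` are available. Inside a nested region, `omp_get_thread_num()` returns an id that is local to the innermost team, so the same value recurs in every inner team. Chaining the ancestor ids from the outermost level down gives a path that differs for threads running concurrently. The names `thread_path` and `demo_nested` are hypothetical, and the serial stubs exist only so the code also builds without OpenMP.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption: lets the sketch build without OpenMP). */
static int omp_get_level(void) { return 0; }
static int omp_get_ancestor_thread_num(int level) { return level == 0 ? 0 : -1; }
static int omp_get_thread_num(void) { return 0; }
static void omp_set_max_active_levels(int levels) { (void)levels; }
#endif

/* Hypothetical helper: writes the chain of team-local thread ids from the
 * outermost level down to the current one (for example "0.1.0") into buf.
 * Returns the current nesting level. */
int thread_path(char *buf, size_t n)
{
    assert(buf != NULL && n > 0);
    int level = omp_get_level();
    size_t used = 0;
    buf[0] = '\0';
    for (int l = 0; l <= level; ++l) {
        int w = snprintf(buf + used, n - used, l ? ".%d" : "%d",
                         omp_get_ancestor_thread_num(l));
        if (w < 0 || (size_t)w >= n - used)
            break; /* buffer exhausted: keep what fits */
        used += (size_t)w;
    }
    return level;
}

/* Hypothetical demo: two nested teams of two threads each. */
void demo_nested(void)
{
    omp_set_max_active_levels(2); /* allow two active nesting levels */
    #pragma omp parallel num_threads(2)
    {
        #pragma omp parallel num_threads(2)
        {
            char path[64];
            thread_path(path, sizeof path);
            #pragma omp critical
            printf("omp_get_thread_num()=%d path=%s\n",
                   omp_get_thread_num(), path);
        }
    }
}
```

When built with OpenMP enabled (for example with `-fopenmp` on GCC) and called from `main`, `demo_nested` should print the inner-team id alongside the path for each thread. If the runtime grants both levels of parallelism, the inner-team ids repeat across teams while the paths do not. A path of this kind distinguishes threads that are alive at the same time. Whether it also maps stably onto per-thread measurement storage across successive parallel regions is a separate question that this sketch does not address.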

Author information

Corresponding author

Correspondence to Alan Morris.

About this article

Cite this article

Morris, A., Malony, A.D. & Shende, S.S. Supporting Nested OpenMP Parallelism in the TAU Performance System. Int J Parallel Prog 35, 417–436 (2007). https://doi.org/10.1007/s10766-007-0050-5

  • DOI: https://doi.org/10.1007/s10766-007-0050-5
