Supporting Nested OpenMP Parallelism in the TAU Performance System

  • Conference paper
OpenMP Shared Memory Parallel Programming (IWOMP 2005)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4315)

Abstract

Nested OpenMP parallelism allows an application to spawn teams of nested threads. This hierarchical nature of thread creation and usage poses problems for performance measurement tools that must determine thread context to properly maintain per-thread performance data. In this paper we describe the problem and a novel solution for identifying threads uniquely. Our approach has been implemented in the TAU performance system and has been successfully used in profiling and tracing OpenMP applications with nested parallelism. We also describe how extensions to the OpenMP standard can help tool developers uniquely identify threads.
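
The identification problem described in the abstract can be illustrated with a small model. The sketch below is not the paper's solution and does not use TAU or an OpenMP runtime; it only assumes that, inside a nested region, a thread sees a number that is local to its innermost team (as `omp_get_thread_num()` reports). The function names `team_local_ids`, `path_ids`, and `count_collisions` are hypothetical and exist only for this illustration.

```python
from itertools import product


def team_local_ids(outer_size, inner_size):
    """Model the innermost team-local number seen by each nested thread.

    Assumption: every outer thread spawns its own inner team, and each
    inner thread only knows its number within that inner team.
    """
    return [inner for _outer, inner in product(range(outer_size), range(inner_size))]


def path_ids(outer_size, inner_size):
    """One hypothetical way to disambiguate: record the (outer, inner) path."""
    return list(product(range(outer_size), range(inner_size)))


def count_collisions(ids):
    """Number of threads whose identifier is already used by another thread."""
    return len(ids) - len(set(ids))


if __name__ == "__main__":
    local = team_local_ids(2, 2)
    paths = path_ids(2, 2)
    # Team-local numbers repeat across inner teams, so per-thread data
    # indexed by them would be shared by unrelated threads.
    print(local, count_collisions(local))
    # The (outer, inner) path is distinct for every thread in this model.
    print(paths, count_collisions(paths))
```

In this model, two outer threads that each start an inner team of two produce four threads but only two distinct team-local numbers, which is why a measurement tool needs some additional context to keep per-thread performance data separate.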

Editor information

Matthias S. Mueller, Barbara M. Chapman, Bronis R. de Supinski, Allen D. Malony, Michael Voss

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Morris, A., Malony, A.D., Shende, S.S. (2008). Supporting Nested OpenMP Parallelism in the TAU Performance System. In: Mueller, M.S., Chapman, B.M., de Supinski, B.R., Malony, A.D., Voss, M. (eds) OpenMP Shared Memory Parallel Programming. IWOMP 2005. Lecture Notes in Computer Science, vol 4315. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68555-5_23

  • DOI: https://doi.org/10.1007/978-3-540-68555-5_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68554-8

  • Online ISBN: 978-3-540-68555-5

  • eBook Packages: Computer Science, Computer Science (R0)
