Improve OpenMP Performance by Extending BARRIER and REDUCTION Constructs

Conference paper

In: High Performance Computing (ISHPC 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2858)

Abstract.

Barrier synchronization and reduction are global operations used frequently in large-scale OpenMP programs. To improve OpenMP performance, we present two new directives, BARRIER(0) and ALLREDUCTION, which extend the BARRIER and REDUCTION constructs of the OpenMP API. The new extensions have been implemented in our portable OpenMP compiler on JIAJIA. Benchmark tests and experiments show that these constructs significantly decrease the system overheads of synchronization, reduction operations, and access to reduction variables on SDSM systems. We predict that a similar performance improvement can be obtained on ccNUMA systems.
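For readers unfamiliar with the two constructs being extended, the following is a minimal sketch of the standard OpenMP REDUCTION clause and BARRIER directive, written in C. It does not show BARRIER(0) or ALLREDUCTION, whose syntax and semantics are not given in this abstract. The function names `reduce_sum` and `sum_squares_two_phase` are hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* Standard reduction: each thread accumulates a private partial sum,
 * and the partial sums are combined into `sum` at the end of the loop. */
long reduce_sum(int n) {
    assert(n >= 0);
    long sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 1; i <= n; i++) {
        sum += i;
    }
    return sum;
}

/* Standard barrier: phase 1 fills `scratch` without the loop's implicit
 * barrier (nowait), so an explicit barrier is required before phase 2
 * reads elements that other threads may have written. */
long sum_squares_two_phase(int n, long *scratch) {
    assert(n >= 0);
    long total = 0;
    #pragma omp parallel
    {
        #pragma omp for nowait
        for (int i = 0; i < n; i++) {
            scratch[i] = (long)i * i;
        }

        /* All threads wait here until every element has been written. */
        #pragma omp barrier

        /* Read in reverse order so threads touch data from other threads. */
        #pragma omp for reduction(+:total)
        for (int i = 0; i < n; i++) {
            total += scratch[n - 1 - i];
        }
    }
    return total;
}
```

Both functions return the same values when compiled without OpenMP support, because the pragmas are then ignored and the loops run serially. The barrier and the combining step of the reduction are the global operations whose overhead the abstract refers to.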

This work was supported by the National 863 Hi-Tech Programme of China under Grant No. 2002AA1Z2105.

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chun, H., Xuejun, Y. (2003). Improve OpenMP Performance by Extending BARRIER and REDUCTION Constructs. In: Veidenbaum, A., Joe, K., Amano, H., Aiso, H. (eds) High Performance Computing. ISHPC 2003. Lecture Notes in Computer Science, vol 2858. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39707-6_48

  • DOI: https://doi.org/10.1007/978-3-540-39707-6_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20359-9

  • Online ISBN: 978-3-540-39707-6

  • eBook Packages: Springer Book Archive
