Abstract.
Barrier synchronization and reduction are global operations used frequently in large-scale OpenMP programs. To improve OpenMP performance, we present two new directives, BARRIER(0) and ALLREDUCTION, which extend the BARRIER and REDUCTION constructs of the OpenMP API. The extensions have been implemented in our portable OpenMP compiler on JIAJIA. Benchmarks and experiments show that these constructs significantly reduce the overheads of synchronization, reduction operations, and accesses to reduction variables on SDSM systems. Similar performance improvements can be expected on ccNUMA systems.
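For readers unfamiliar with the two constructs being extended, the following is a minimal sketch of the standard OpenMP REDUCTION clause and BARRIER directive, written in C. It does not show BARRIER(0) or ALLREDUCTION, whose syntax and semantics are not given in this abstract. The function names `sum_squares` and `double_then_rotate` are hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* Standard REDUCTION: each thread accumulates a private partial sum,
 * and the partial sums are combined into `total` when the loop ends. */
double sum_squares(int n)
{
    assert(n >= 0);
    double total = 0.0;

    #pragma omp parallel for reduction(+:total)
    for (int i = 1; i <= n; i++) {
        total += (double)i * (double)i;
    }
    /* The parallel loop ends with an implicit barrier, so `total`
     * holds the combined value here. */
    return total;
}

/* Standard BARRIER: phase 2 reads elements of `tmp` that other threads
 * may have written in phase 1, so all threads must finish phase 1 first. */
void double_then_rotate(const int *in, int *tmp, int *out, int n)
{
    #pragma omp parallel
    {
        /* Phase 1: nowait drops the loop's implicit barrier ... */
        #pragma omp for nowait
        for (int i = 0; i < n; i++) {
            tmp[i] = in[i] * 2;
        }

        /* ... so an explicit global barrier is needed before phase 2. */
        #pragma omp barrier

        /* Phase 2: each element takes its right-hand neighbour's value. */
        #pragma omp for
        for (int i = 0; i < n; i++) {
            out[i] = tmp[(i + 1) % n];
        }
    }
}
```

Both functions give the same results when compiled without OpenMP support, since the pragmas are then ignored and the loops run serially. On a software DSM system, each such barrier and each combination of partial sums involves communication between nodes, which is the overhead the abstract refers to.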
This work was supported by National 863 Hi-Tech Programme of China under Grant No. 2002AA1Z2105.
© 2003 Springer-Verlag Berlin Heidelberg
Chun, H., Xuejun, Y. (2003). Improve OpenMP Performance by Extending BARRIER and REDUCTION Constructs. In: Veidenbaum, A., Joe, K., Amano, H., Aiso, H. (eds) High Performance Computing. ISHPC 2003. Lecture Notes in Computer Science, vol 2858. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39707-6_48
DOI: https://doi.org/10.1007/978-3-540-39707-6_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20359-9
Online ISBN: 978-3-540-39707-6
eBook Packages: Springer Book Archive