Abstract
OpenMP is a widely used standard for parallel programming on a broad range of SMP systems. In the OpenMP programming model, synchronization points are specified by implicit or explicit barrier operations. However, certain classes of computations, such as stencil algorithms, need to synchronize only particular tasks or threads, so as to support pipeline parallelism with better synchronization efficiency and data locality than wavefront parallelism using all-to-all barriers. In this paper, we propose two new synchronization constructs in the OpenMP programming model, thread-level phasers and iteration-level phasers, to support synchronization patterns such as point-to-point synchronization and sub-group barriers among neighbor threads. Experimental results on three platforms using numerical applications show performance improvements of phasers over OpenMP barriers of up to 1.74× on an 8-core Intel Nehalem system, up to 1.59× on a 16-core Core-2-Quad system, and up to 1.44× on a 32-core IBM Power7 system. It is reasonable to expect larger improvements on future manycore processors.
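The phasers proposed in the paper are OpenMP language extensions, so no portable implementation is shown here. As a rough illustration of the point-to-point semantics the abstract describes (not the paper's actual API), the following Python sketch emulates per-thread phasers with a hypothetical `Phaser` class: in a stencil-style pipeline, each thread waits only on its left and right neighbors' phases instead of joining an all-to-all barrier.

```python
import threading

class Phaser:
    """Minimal phaser emulation (hypothetical API): one signaller advances
    a phase counter; any number of waiters block until a given phase."""
    def __init__(self):
        self._phase = 0
        self._cond = threading.Condition()

    def signal(self):
        # Advance this thread's phase and wake all waiters.
        with self._cond:
            self._phase += 1
            self._cond.notify_all()

    def wait_phase(self, p):
        # Block until the owner has signalled phase p or later.
        with self._cond:
            while self._phase < p:
                self._cond.wait()

def pipeline(num_threads, num_iters, log):
    """Run a stencil-like pipeline: at iteration t, thread tid may proceed
    only after neighbors tid-1 and tid+1 have finished iteration t-1."""
    phasers = [Phaser() for _ in range(num_threads)]
    lock = threading.Lock()

    def worker(tid):
        for t in range(1, num_iters + 1):
            # Point-to-point synchronization: wait only for the two
            # neighbors, not for all threads (no global barrier).
            if tid > 0:
                phasers[tid - 1].wait_phase(t - 1)
            if tid < num_threads - 1:
                phasers[tid + 1].wait_phase(t - 1)
            with lock:
                log.append((tid, t))  # stand-in for the stencil update
            phasers[tid].signal()     # publish completion of iteration t

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
```

Because each thread blocks only on its two neighbors, distant threads can run several iterations apart, which is the pipeline-parallel slack that an all-to-all barrier would forbid.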
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shirako, J., Sharma, K., Sarkar, V. (2011). Unifying Barrier and Point-to-Point Synchronization in OpenMP with Phasers. In: Chapman, B.M., Gropp, W.D., Kumaran, K., Müller, M.S. (eds) OpenMP in the Petascale Era. IWOMP 2011. Lecture Notes in Computer Science, vol 6665. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21487-5_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21486-8
Online ISBN: 978-3-642-21487-5
eBook Packages: Computer Science, Computer Science (R0)