OpenMP compiler for distributed memory architectures

Abstract

OpenMP is an emerging industry standard for shared memory architectures. Although OpenMP offers ease of use and supports incremental parallelization, message passing remains the most widely used programming model for distributed memory architectures, and effectively extending OpenMP to such architectures is an active research topic. This paper proposes an OpenMP system, called KLCoMP, for distributed memory architectures. Based on the "partially replicating shared arrays" memory model, we propose an inter-procedural algorithm for shared array recognition, an optimization technique based on producer/consumer relationships, and a communication generation technique for nonlinear references. We evaluate performance on nine benchmarks covering computational fluid dynamics, integer sorting, molecular dynamics, earthquake simulation, and computational chemistry. The average scalability of the KLCoMP versions is close to that of the MPI versions. We compare the performance of our translated programs with versions generated for Omni+SCASH, LLCoMP, and OpenMP(Purdue), and find that applications translated by KLCoMP, especially irregular applications, achieve better performance than the other versions.

References

  1. OpenMP Architecture Review Board. OpenMP Application Program Interface, version 2.5, 2005

  2. Sato M, Satoh S, Kusano K, et al. Design of OpenMP compiler for an SMP cluster. In: Proc. of the 1st European Workshop on OpenMP. Berlin: Springer, 1999. 32–39

  3. Costa J J, Cortes T, Martorell X, et al. Running OpenMP applications efficiently on an everything-shared SDSM. J Parall Distrib Comput, 2006, 66: 647–658

  4. Min S J, Eigenmann R. Combined compile-time and runtime-driven, pro-active data movement in software DSM systems. In: Proc. of Seventh Workshop on Languages, Compilers, and Run-time Support for Scalable Systems, Houston, Texas, 2004. 1–6

  5. Lu H H. Quantifying the performance differences between PVM and TreadMarks. J Parall Distrib Comput, 1997, 43: 65–78

  6. Basumallik A, Min S, Eigenmann R. Programming distributed memory systems using OpenMP. In: Proc. of International Parallel and Distributed Processing Symposium. New York: IEEE Press, 2007. 1–8

  7. Basumallik A, Eigenmann R. Towards automatic translation of OpenMP to MPI. In: Proc. of the 19th Annual International Conference on Supercomputing. New York: ACM Press, 2005. 189–198

  8. Basumallik A, Eigenmann R. Optimizing irregular shared-memory applications for distributed-memory systems. In: Proc. of the Eleventh ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. New York: ACM Press, 2006. 119–128

  9. MPICH2, version 1.0.7. http://www.mcs.anl.gov/research/projects/mpich2/, March 21, 2008

  10. Dorta A, Lopez P, Sande F. Basic skeletons in llc. Parall Comput, 2006, 32: 491–506

  11. Eigenmann R, Hoeflinger J, Kuhn R H, et al. Is OpenMP for Grids? In: Proc. of International Parallel and Distributed Processing Symposium. New York: IEEE Press, 2002. 171–178

  12. Jeun W C, Kee Y S, Ha S. Improving performance of OpenMP for SMP clusters through overlapped page migrations. In: Proc. of International Workshop on OpenMP, Reims, France, 2006

  13. Eachempati D, Huang L, Chapman B M. Strategies and implementation for translating OpenMP code for clusters. In: Proc. of High Performance Computing and Communications. Berlin: Springer, 2007. 420–431

  14. Jin H, Frumkin M, Yan J. The OpenMP implementation of NAS parallel benchmarks and its performance. Technical Report NAS-99-011, 1999

  15. Aslot V, Domeika M, Eigenmann R. SPEComp: A new benchmark suite for measuring parallel computer performance. In: Proc. of the Workshop on OpenMP Applications and Tools. Berlin: Springer, 2001. 1–10

  16. COSMIC group, University of Maryland. COSMIC software for irregular applications. http://www.cs.umd.edu/projects/osmic/software.html

  17. Brooks B R, Bruccoleri R E, Olafson B D, et al. A program for macromolecular energy, minimization, and dynamics calculations. J Comp Chem, 1983, 4: 187–217

  18. Brandes T. ADAPTOR Users Guide, Fraunhofer Gesellschaft, Augustin, Germany, 2004

  19. Petersen P, Padua D A. Static and dynamic evaluation of data dependence analysis techniques. IEEE Trans Parall Distrib Syst, 1996, 7: 1121–1132

  20. Brezany P, Dang M. CHAOS+ Runtime Library. Internal Report, Institute for Software Technology and Parallel Systems, University of Vienna, September 1997

  21. Strout M M, Kreaseck B, Hovland P D. Data-flow analysis for MPI programs. In: Proc. of the 2006 International Conference on Parallel Processing, Columbus, Ohio, USA, 2006. 175–184

  22. Wang J, Hu C J, Zhang J L, et al. An optimized strategy for collective communication in data parallelism (in Chinese). Chinese J Comput, 2008, 2: 318–328

  23. Engelen R, Birch J, Shou Y, et al. A unified framework for nonlinear dependence testing and symbolic analysis. In: Proc. of the ACM International Conference on Supercomputing. New York: ACM Press, 2004. 106–115

  24. Li Z. Array privatization for parallel execution of loops. In: Proc. of the ACM International Conference on Supercomputing. New York: ACM Press, 1992. 313–322

  25. Haghighat M R, Polychronopoulos C D. Symbolic analysis for parallelizing compilers. ACM Trans Program Lang Syst, 1996, 18: 477–518

  26. Hu C, Li J, Wang J, et al. Communication generation for irregular parallel applications. In: Proc. of IEEE International Symposium on Parallel Computing in Electrical Engineering. New York: IEEE Press, 2006. 263–270

  27. Wang J, Hu C, Zhang J, et al. OpenMP extensions for irregular parallel applications on clusters. In: Proc. of International Workshop on OpenMP. Lecture Notes in Computer Science 4935. Berlin: Springer, 2007. 101–111

  28. Tseng E, Gaudiot J. Communication generation for aligned and cyclic(k) distributions using integer lattice. IEEE Trans Parallel Distrib Syst, 1999, 10: 136–146

  29. Ojima Y, Sato M, Harada H, et al. Performance of cluster-enabled OpenMP for the SCASH software distributed shared memory system. In: Proc. of the 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid), Tokyo, Japan, 2003. 450–456

Author information

Correspondence to Jue Wang.

About this article

Cite this article

Wang, J., Hu, C., Zhang, J. et al. OpenMP compiler for distributed memory architectures. Sci. China Inf. Sci. 53, 932–944 (2010). https://doi.org/10.1007/s11432-010-0074-0
