Global arrays: A nonuniform memory access programming model for high-performance computers

Published in The Journal of Supercomputing.

Abstract

Portability, efficiency, and ease of coding are all important considerations in choosing the programming model for a scalable parallel application. The message-passing programming model is widely used because of its portability, yet some applications are too complex to code in it while also trying to maintain a balanced computation load and avoid redundant computations. The shared-memory programming model simplifies coding, but it is not portable and often provides little control over interprocessor data transfer costs. This paper describes an approach, called Global Arrays (GAs), that combines the better features of both other models, leading to both simple coding and efficient execution. The key concept of GAs is that they provide a portable interface through which each process in a MIMD parallel program can asynchronously access logical blocks of physically distributed matrices, with no need for explicit cooperation by other processes. We have implemented the GA library on a variety of computer systems, including the Intel Delta and Paragon, the IBM SP-1 and SP-2 (all message passers), the Kendall Square Research KSR-1/2 and the Convex SPP-1200 (nonuniform access shared-memory machines), the CRAY T3D (a globally addressable distributed-memory computer), and networks of UNIX workstations. We discuss the design and implementation of these libraries, report their performance, illustrate the use of GAs in the context of computational chemistry applications, and describe the use of a GA performance visualization tool.
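The abstract's key concept — a logically shared array, physically partitioned across processes, that any process can read or write in arbitrary rectangular blocks without cooperation from the owners — can be illustrated with a small single-process sketch. The class and method names below (`GlobalArray`, `put`, `get`, `owner`) are illustrative toys, not the actual GA library API, and the block-row distribution is an assumed simplification:

```python
import numpy as np

class GlobalArray:
    """Toy single-process emulation of the Global Arrays idea: a logically
    shared 2-D array whose rows are block-partitioned across 'owner'
    processes, with one-sided get/put on arbitrary rectangular blocks.
    Names are illustrative, not the real GA interface."""

    def __init__(self, rows, cols, nprocs):
        self.data = np.zeros((rows, cols))
        # Block-row distribution: process p owns rows bounds[p]..bounds[p+1).
        self.bounds = np.linspace(0, rows, nprocs + 1).astype(int)

    def owner(self, row):
        # Map a global row index to the process that owns it.
        return int(np.searchsorted(self.bounds, row, side="right") - 1)

    def put(self, lo, hi, block):
        # One-sided write of the global block [lo, hi) -- in the real model,
        # the owning processes need not participate explicitly.
        (r0, c0), (r1, c1) = lo, hi
        self.data[r0:r1, c0:c1] = block

    def get(self, lo, hi):
        # One-sided read of an arbitrary block, possibly spanning owners.
        (r0, c0), (r1, c1) = lo, hi
        return self.data[r0:r1, c0:c1].copy()

ga = GlobalArray(8, 8, nprocs=4)
ga.put((2, 2), (5, 5), np.ones((3, 3)))
block = ga.get((3, 3), (6, 6))
print(block.sum())                # 4.0: 4 of the 9 fetched elements were set
print(ga.owner(0), ga.owner(7))  # rows 0 and 7 live on different processes
```

In the actual library the `get`/`put` traffic becomes interprocessor communication whose cost is visible to (and controllable by) the programmer — the feature the abstract contrasts with conventional shared memory.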



Cite this article

Nieplocha, J., Harrison, R.J. & Littlefield, R.J. Global arrays: A nonuniform memory access programming model for high-performance computers. J Supercomput 10, 169–189 (1996). https://doi.org/10.1007/BF00130708
