Abstract
As part of the recent focus on increasing the productivity of parallel application developers, Co-array Fortran (CAF) has emerged as an appealing alternative to the Message Passing Interface (MPI). CAF belongs to the family of global address space parallel programming languages; such languages provide the abstraction of globally addressable memory accessed using one-sided communication. At Rice University we are developing cafc, an open source, multiplatform CAF compiler. Our earlier studies show that cafc-compiled CAF programs achieve performance similar to that of the corresponding MPI codes for the NAS Parallel Benchmarks. In this paper, we present a study of several CAF implementations of Sweep3D on four modern architectures. We analyze the impact of using one-sided communication in Sweep3D, identify potential sources of inefficiency, and suggest ways to address them. Our results show that we achieve performance comparable to that of the MPI version on three cluster-based architectures and outperform it by up to 10% on the SGI Altix 3000.
Additional information
This work was supported in part by the Department of Energy under Grant DE-FC03-01ER25504/A000, the Los Alamos Computer Science Institute (LACSI) through LANL contract number 03891-99-23 as part of the prime contract (W-7405-ENG-36) between the DOE and the Regents of the University of California, Texas Advanced Technology Program under Grant 003604-0059-2001, and Compaq Computer Corporation under a cooperative research agreement. This research was performed in part using the Molecular Science Computing Facility (MSCF) in the William R. Wiley Environmental Molecular Sciences Laboratory, a national scientific user facility sponsored by the U.S. Department of Energy’s Office of Biological and Environmental Research and located at the Pacific Northwest National Laboratory. Pacific Northwest is operated for the Department of Energy by Battelle. The computations were performed in part on an Itanium cluster purchased with support from the NSF under Grant EIA-0216467, Intel, and Hewlett Packard and on the National Science Foundation Terascale Computing System at the Pittsburgh Supercomputing Center.
Cristian Coarfa and Yuri Dotsenko contributed equally to this work.
Cite this article
Coarfa, C., Dotsenko, Y. & Mellor-Crummey, J. Experiences with Sweep3D implementations in Co-array Fortran. J Supercomput 36, 101–121 (2006). https://doi.org/10.1007/s11227-006-7952-7
DOI: https://doi.org/10.1007/s11227-006-7952-7