Benchmarking and optimization of scientific codes on the CRAY X-MP, CRAY-2, and SCS-40 vector computers

Pfeiffer, Wayne; Alagar, Arnold; Kamrath, Anke; Leary, Robert H.; Rogers, Jack

doi:10.1007/BF00127877


Published: June 1990

Volume 4, pages 131–152 (1990)

The Journal of Supercomputing


Abstract

Various scientific codes were benchmarked on three vector computers: the CRAY X-MP/48 and CRAY-2 supercomputers and the SCS-40/XM minisupercomputer. On the X-MP, two Fortran compilers were also compared. The benchmarks, which were initially all in Fortran, consisted of six research codes from Caltech, the 24 Livermore loops, and two cases from the LINPACK benchmark. As a corollary effort, the effect of manual optimization on the Caltech codes was also considered, including the selective use of assembly-language math routines.

On each machine, the ratio of the maximum to the minimum speed across the benchmarks was more than a factor of 50, even though the study was restricted to unitasked (i.e., single-CPU) runs. The maximum speed for all-Fortran codes was more than 80% of the peak speed on the X-MP and SCS, but less than 40% of the peak speed on the CRAY-2.

Despite having a clock that is 2.3 times faster, the CRAY-2 generally runs slower than the X-MP, typically by a factor of 1.3 for scalar code and even slower for moderately vectorized code. Only for highly vectorized codes does the CRAY-2 marginally outperform the X-MP, at least for in-core benchmarks. The poorer performance of the CRAY-2 is due to its slower scalar speed, its lack of chaining, its single port between each CPU and memory, and its relatively slow memory.
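The scalar comparison above can be restated in clock periods. The following is a back-of-the-envelope sketch, not a calculation from the paper: it assumes only that run time equals clock periods divided by clock rate, and the function name `relative_clock_periods` is hypothetical.

```python
def relative_clock_periods(clock_ratio, slowdown):
    """Clock periods used by machine A relative to machine B for the same job.

    clock_ratio: clock rate of A divided by clock rate of B.
    slowdown:    run time on A divided by run time on B.
    Assumes run time = clock periods / clock rate on both machines.
    """
    if clock_ratio <= 0 or slowdown <= 0:
        raise ValueError("both ratios must be positive")
    return clock_ratio * slowdown


# CRAY-2 versus X-MP for scalar code, using the figures quoted in the abstract.
periods = relative_clock_periods(clock_ratio=2.3, slowdown=1.3)
print(f"CRAY-2 uses about {periods:.1f}x as many clock periods as the X-MP")
```

Under this assumption, a machine that is 1.3 times slower despite a clock 2.3 times faster spends roughly three times as many clock periods on the same scalar work.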

The SCS runs slower than the X-MP by a factor of 2.6 in the scalar limit and by a factor of 4.7 (the clock ratio) in the vector limit when the same CFT compiler is used on both machines. Use of the newer CFT77 compiler on the X-MP negates the relative enhancement of the SCS scalar performance.
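The scalar and vector limits quoted above can be joined by a simple Amdahl-style interpolation. This is an illustrative sketch and not the authors' model: it assumes that a fraction `vector_fraction` of the X-MP run time is spent in vector code, and that the scalar and vector portions slow down on the SCS by 2.6 and 4.7 respectively. The function name and its parameters are hypothetical.

```python
def scs_slowdown(vector_fraction, scalar_ratio=2.6, vector_ratio=4.7):
    """Estimated SCS-40 run time divided by X-MP run time.

    vector_fraction: share of the X-MP run time spent in vector code (0 to 1).
    scalar_ratio:    SCS/X-MP time ratio in the scalar limit (from the abstract).
    vector_ratio:    SCS/X-MP time ratio in the vector limit (the clock ratio).
    """
    if not 0.0 <= vector_fraction <= 1.0:
        raise ValueError("vector_fraction must lie in [0, 1]")
    scalar_part = (1.0 - vector_fraction) * scalar_ratio
    vector_part = vector_fraction * vector_ratio
    return scalar_part + vector_part


for f in (0.0, 0.5, 0.9, 1.0):
    print(f"vector fraction {f:.1f}: SCS slower by a factor of {scs_slowdown(f):.2f}")
```

Because the fraction is measured in X-MP run time, the estimate is a straight-line blend of the two limits; real codes may deviate from it, for example when short vector lengths or memory conflicts dominate.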

On the X-MP, the CFT77 3.0 compiler produces significantly faster code than CFT 1.14, typically by a factor of 1.4. This gain comes, however, at the expense of compilation times that are three to five times longer. Regardless of the compiler, manual optimization is still worthwhile. For three of the six Caltech codes compiled with CFT77, run-time speedups of 2, 4, and 16 were achieved through Fortran optimization alone.
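The compiler trade-off can be made concrete with a break-even estimate. The sketch below is an illustration under stated assumptions, not an analysis from the paper: it assumes one compilation followed by one run, counts total cost as compile time plus run time, and uses the typical factors from the abstract. The function name `break_even_run_to_compile_ratio` is hypothetical.

```python
def break_even_run_to_compile_ratio(compile_factor, run_speedup=1.4):
    """Run-time/compile-time ratio (both measured with CFT) above which
    CFT77 gives the lower total of compile time plus run time.

    compile_factor: CFT77 compile time divided by CFT compile time.
    run_speedup:    CFT run time divided by CFT77 run time.
    """
    if compile_factor < 1.0:
        raise ValueError("compile_factor must be at least 1")
    if run_speedup <= 1.0:
        raise ValueError("run_speedup must exceed 1")
    extra_compile = compile_factor - 1.0      # added compile cost, in CFT compile times
    run_saving = 1.0 - 1.0 / run_speedup      # saved run cost, in CFT run times
    return extra_compile / run_saving


for k in (3.0, 5.0):
    ratio = break_even_run_to_compile_ratio(k)
    print(f"compile factor {k:.0f}: run time must exceed {ratio:.1f}x compile time")
```

With these figures the slower compiler pays for itself once the CFT run time exceeds roughly 7 to 14 times the CFT compile time. If the same executable is run several times, the threshold per run shrinks in proportion.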



Author information

Authors and Affiliations

San Diego Supercomputer Center, P.O. Box 85608, San Diego, CA 92138, USA
Wayne Pfeiffer, Arnold Alagar, Anke Kamrath, Robert H. Leary & Jack Rogers



About this article

Cite this article

Pfeiffer, W., Alagar, A., Kamrath, A. et al. Benchmarking and optimization of scientific codes on the CRAY X-MP, CRAY-2, and SCS-40 vector computers. J Supercomput 4, 131–152 (1990). https://doi.org/10.1007/BF00127877


Issue Date: June 1990
DOI: https://doi.org/10.1007/BF00127877
