Parallelization of NAS benchmarks for shared memory multiprocessors

Waheed, Abdul; Yan, Jerry

doi:10.1007/BFb0037164

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1401)

Included in the following conference series:

International Conference on High-Performance Computing and Networking

Abstract

This paper presents our experience parallelizing sequential implementations of the NAS benchmarks using compiler directives on the SGI Origin2000 distributed shared memory (DSM) system. Porting existing applications to new high-performance parallel and distributed computing platforms is a challenging task. Ideally, a user develops a sequential version of the application and leaves the task of porting the code to parallelization tools and compilers. Because shared-memory multiprocessors are simple to program, compiler developers have provided various facilities that allow users to exploit parallelism. Native compilers on the SGI Origin2000 support multiprocessing directives that let users exploit loop-level parallelism in their programs. Additionally, supporting tools can accomplish this process automatically. We experimented with these compiler directives and supporting tools by parallelizing sequential implementations of the NAS benchmarks. The results reported in this paper indicate that, with minimal effort, the performance gain is comparable to that of hand-parallelized, carefully optimized message-passing implementations of the same benchmarks.
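
To make the idea of directive-based, loop-level parallelism concrete, the following is a minimal sketch and not code from the paper. The paper used SGI's native multiprocessing directives on the benchmark sources. This sketch instead uses OpenMP pragmas in C as a stand-in, and the function names `scaled_add` and `dot` are hypothetical. In both cases the sequential loop body is left unchanged and a single directive asks the compiler to distribute the iterations across threads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel (not from the NAS benchmarks): y[i] = a * x[i] + y[i].
 * Each iteration touches a distinct element, so the iterations are
 * independent and the loop can be divided among threads. */
void scaled_add(int n, double a, const double *x, double *y)
{
    assert(n >= 0 && x != NULL && y != NULL);

    /* The directive is the only change to the sequential code. A compiler
     * without OpenMP enabled ignores it and runs the loop sequentially. */
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}

/* Hypothetical kernel: dot product. The shared accumulator would be a data
 * race in a plain parallel loop, so the reduction clause gives each thread
 * a private partial sum and combines them when the loop ends. */
double dot(int n, const double *x, const double *y)
{
    assert(n >= 0 && x != NULL && y != NULL);

    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

With GCC or Clang, compiling with `-fopenmp` enables the directives. Without that flag the same source builds as an ordinary sequential program, which is the property that makes this style of porting low-effort.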

Author information

Authors and Affiliations

Mail Stop T27A-2, NASA Ames Research Center, Moffett Field, CA 94035-1000
Abdul Waheed & Jerry Yan

Editor information

Peter Sloot, Marian Bubak, Bob Hertzberger

Cite this paper

Waheed, A., Yan, J. (1998). Parallelization of NAS benchmarks for shared memory multiprocessors. In: Sloot, P., Bubak, M., Hertzberger, B. (eds) High-Performance Computing and Networking. HPCN-Europe 1998. Lecture Notes in Computer Science, vol 1401. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0037164

DOI: https://doi.org/10.1007/BFb0037164
Published: 22 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64443-9
Online ISBN: 978-3-540-69783-1
eBook Packages: Springer Book Archive
