Parallelization of NAS benchmarks for shared memory multiprocessors

  • 2. Computational Science
  • Conference paper
High-Performance Computing and Networking (HPCN-Europe 1998)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1401))

Abstract

This paper presents our experience parallelizing sequential implementations of the NAS benchmarks using compiler directives on the SGI Origin2000 distributed shared memory (DSM) system. Porting existing applications to new high-performance parallel and distributed computing platforms is a challenging task. Ideally, a user develops a sequential version of the application and leaves the task of porting the code to parallelization tools and compilers. Because shared-memory multiprocessors are simple to program, compiler developers have provided various facilities that let users exploit parallelism. The native compilers on the SGI Origin2000 support multiprocessing directives that let users exploit loop-level parallelism in their programs. Additionally, supporting tools can perform this process automatically. We experimented with these compiler directives and supporting tools by parallelizing the sequential implementations of the NAS benchmarks. The results reported in this paper indicate that, with minimal effort, the performance gain is comparable to that of hand-parallelized, carefully optimized message-passing implementations of the same benchmarks.
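As a rough illustration of what directive-based loop-level parallelism looks like, the sketch below annotates two simple loops. It is not taken from the paper: it uses OpenMP directives in C as a stand-in for the SGI-specific multiprocessing directives the abstract refers to, and the function names scaled_add and dot are hypothetical. The sequential loops are left unchanged; only a directive is added above each one.

```c
#include <stddef.h>

/* Hypothetical kernel (not from the paper): y[i] = a * x[i] + y[i].
 * Every iteration is independent, so a single directive is enough
 * to divide the iterations among threads. */
void scaled_add(size_t n, double a, const double *x, double *y)
{
    /* OpenMP directive, used here as a stand-in for the vendor directives.
     * A compiler without OpenMP enabled ignores it and runs the loop serially. */
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++) {
        y[i] = a * x[i] + y[i];
    }
}

/* Hypothetical kernel (not from the paper): dot product of x and y.
 * The loop carries a dependence on sum, so the directive declares it
 * as a reduction; each thread accumulates a private partial sum and
 * the partial sums are combined when the loop ends. */
double dot(size_t n, const double *x, const double *y)
{
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

Built with an OpenMP-capable compiler (for example, the -fopenmp flag in GCC), the loops run in parallel; built without it, the same source compiles and runs sequentially. This is the property that makes the directive approach low-effort compared with rewriting a code for message passing.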

Editor information

Peter Sloot, Marian Bubak, Bob Hertzberger

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Waheed, A., Yan, J. (1998). Parallelization of NAS benchmarks for shared memory multiprocessors. In: Sloot, P., Bubak, M., Hertzberger, B. (eds) High-Performance Computing and Networking. HPCN-Europe 1998. Lecture Notes in Computer Science, vol 1401. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0037164

  • DOI: https://doi.org/10.1007/BFb0037164

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64443-9

  • Online ISBN: 978-3-540-69783-1

  • eBook Packages: Springer Book Archive
