Communication and Optimization Aspects on Hybrid Architectures

Rabenseifner, Rolf

doi:10.1007/3-540-45825-5_59

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2474)

Included in the following conference series:

European Parallel Virtual Machine / Message Passing Interface Users’ Group Meeting

Abstract

Most HPC systems are clusters of shared memory nodes. Parallel programming must therefore combine distributed memory parallelization across the node interconnect with shared memory parallelization inside each node. The hybrid MPI+OpenMP programming model is compared with pure MPI and with compiler-based parallelization. The paper focuses on bandwidth and latency aspects, and also on whether the programming paradigms can separate the optimization of communication from that of computation. Benchmark results are presented for hybrid and pure MPI communication.
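To illustrate why bandwidth and latency both matter for communication performance, the following is a minimal sketch of the standard linear cost model, in which a message transfer costs a fixed startup latency plus the message size divided by the asymptotic bandwidth. It is not taken from the paper; the function names and the latency and bandwidth figures are hypothetical values chosen only for illustration.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Linear cost model: startup latency plus size over asymptotic bandwidth."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


def effective_bandwidth(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Bandwidth actually achieved for one message of the given size."""
    return message_bytes / transfer_time(
        message_bytes, latency_s, bandwidth_bytes_per_s
    )


if __name__ == "__main__":
    # Hypothetical interconnect parameters, not measurements from the paper.
    LATENCY_S = 10e-6           # assumed startup latency: 10 microseconds
    BANDWIDTH_BYTES_PER_S = 1e9  # assumed asymptotic bandwidth: 1 GB/s

    for size in (1_000, 10_000, 1_000_000):
        achieved = effective_bandwidth(size, LATENCY_S, BANDWIDTH_BYTES_PER_S)
        share = achieved / BANDWIDTH_BYTES_PER_S
        print(f"{size:>9} bytes: {share:.1%} of asymptotic bandwidth")
```

Under this model, short messages are dominated by latency and long messages by bandwidth; a message whose size equals latency times bandwidth reaches half of the asymptotic bandwidth.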

Author information

Authors and Affiliations

High-Performance Computing-Center (HLRS), University of Stuttgart, Allmandring 30, D-70550, Stuttgart, Germany
Rolf Rabenseifner

Editor information

Editors and Affiliations

Abteilung für Graphische und Parallele Datenverarbeitung (GUP) Institut für Technische Informatik und Telematik, Johannes Kepler Universität Linz, Altenbergstr. 69, 4040, Linz, Austria
Dieter Kranzlmüller & Jens Volkert
Computer and Automation Research Institute, MTA SZTAKI, Hungarian Academy of Sciences, Victor Hugo u. 18-22, 1132, Budapest, Hungary
Peter Kacsuk
Computer Science Department Innovative Computing Laboratory, University of Tennessee, 1122 Volunteer Blvd, Knoxville, TN, 37996-3450, USA
Jack Dongarra

Cite this paper

Rabenseifner, R. (2002). Communication and Optimization Aspects on Hybrid Architectures. In: Kranzlmüller, D., Volkert, J., Kacsuk, P., Dongarra, J. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2002. Lecture Notes in Computer Science, vol 2474. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45825-5_59

DOI: https://doi.org/10.1007/3-540-45825-5_59
Published: 18 September 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44296-7
Online ISBN: 978-3-540-45825-8
eBook Packages: Springer Book Archive
