Mapping communication layouts to network hardware characteristics on massive-scale blue gene systems

  • Special Issue Paper
  • Published in: Computer Science - Research and Development

Abstract

For parallel applications running on high-end computing systems, which processes of an application get launched on which processing cores is typically determined at launch time, without any information about the application's characteristics. As high-end computing systems continue to grow in scale, however, this approach is becoming increasingly inadequate for achieving the best performance. For example, on systems such as IBM Blue Gene and Cray XT that rely on flat 3D torus networks, process communication often involves network sharing, even for highly scalable applications. This causes overall application performance to depend heavily on how processes are mapped onto the network. In this paper, we first analyze the impact of different process mappings on application performance on a massive Blue Gene/P system. We then match this analysis with communication patterns that applications are allowed to describe before they are launched. The underlying process management system can use this combined information, together with the hardware characteristics of the system, to determine the best mapping for the application. Our experiments study the performance of different communication patterns, including 2D and 3D nearest-neighbor communication and structured Cartesian grid communication. Our studies, which scale up to 131,072 cores of the largest BG/P system in the United States (80% of the total system size), demonstrate that different process mappings can yield significant differences in overall performance, especially at scale. For example, we show that this difference can be as much as 30% for P3DFFT and up to twofold for HALO. Through our proposed model, however, such performance differences can be avoided so that the best possible performance is always achieved.
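
The approach outlined above depends on an application describing its communication pattern to the runtime before launch. As a minimal illustration of what such a description can look like with standard interfaces (this is not the paper's actual mechanism, which operates at the process-manager level), the C/MPI sketch below declares a 3D nearest-neighbor pattern using MPI's Cartesian topology calls and permits the library to reorder ranks to better fit the underlying network.

```c
/* Minimal sketch: declare a 3D nearest-neighbor communication pattern
 * via MPI Cartesian topology calls. Setting reorder = 1 permits the MPI
 * implementation to remap ranks onto the physical network; the mechanism
 * described in the paper instead conveys this information to the process
 * management system at launch time, which this sketch does not capture. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int nprocs, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Let MPI factor the process count into a balanced 3D grid. */
    int dims[3] = {0, 0, 0};
    MPI_Dims_create(nprocs, 3, dims);

    /* Periodic in every dimension, matching a 3D torus network. */
    int periods[3] = {1, 1, 1};
    int reorder = 1;                    /* allow rank remapping */
    MPI_Comm cart;
    MPI_Cart_create(MPI_COMM_WORLD, 3, dims, periods, reorder, &cart);

    /* Neighbor ranks along each axis, as used by a halo-style exchange. */
    int left, right, down, up, back, front;
    MPI_Cart_shift(cart, 0, 1, &left, &right);
    MPI_Cart_shift(cart, 1, 1, &down, &up);
    MPI_Cart_shift(cart, 2, 1, &back, &front);

    if (rank == 0)
        printf("process grid: %d x %d x %d\n", dims[0], dims[1], dims[2]);

    MPI_Comm_free(&cart);
    MPI_Finalize();
    return 0;
}
```

On Blue Gene/P, whether such a logical grid actually lines up with the physical torus depends on the mapping chosen at job launch; the paper's point is that exposing the pattern to the process management system lets that mapping be chosen automatically rather than left to a default.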

References

  1. IBM Blue Gene Team (2008) Overview of the IBM Blue Gene/P project. IBM J Res Dev 52(1–2):199–220

  2. Cray Research, Inc (1993) Cray T3D system architecture overview

  3. Argonne National Laboratory. PETSc. http://www.mcs.anl.gov/petsc

  4. Kumar S, Huang C, Almasi G, Kale LV (2007) Achieving strong scaling with NAMD on Blue Gene/L. In: IEEE international parallel and distributed processing symposium

  5. Naval Research Laboratory. Naval Research Laboratory Layered Ocean Model (NLOM). http://www.navo.hpc.mil/Navigator/Fall99_Feature.html

  6. Rabiti C, Smith MA, Kaushik D, Yang WS, Palmiotti G (2008) Parallel method of characteristics on unstructured meshes for the UNIC code. In: PHYSOR, Interlaken, Switzerland, 14–19 Sept 2008

  7. Balaji P, Chan A, Thakur R, Gropp W, Lusk E (2009) Toward message passing for a million processes: characterizing MPI on a massive scale Blue Gene/P. Comput Sci Res Dev, special edition (presented at the International Supercomputing Conference (ISC)); Best Paper Award

  8. Balaji P, Naik H, Desai N (2009) Understanding network saturation behavior on large-scale Blue Gene/P systems. In: Proceedings of the international conference on parallel and distributed systems (ICPADS), Shenzhen, China, 8–11 Dec 2009

  9. Pekurovsky D (2009) P3DFFT webpage, Feb 2009. http://www.sdsc.edu/us/resources/p3dfft/index.php

  10. Wallcraft AJ (1999) The Halo benchmark. http://www.navo.hpc.mil/Navigator/PDFS/Fall1999.pdf

  11. Fischer P, Lottes J, Pointer D, Siegel A (2008) Petascale algorithms for reactor hydrodynamics. J Phys Conf Ser 125(1). doi:10.1088/1742-6596/125/1/012076

  12. Frigo M, Johnson SG (2005) The design and implementation of FFTW3. Proc IEEE 93:216–231

  13. Cooley JW, Tukey JW (1965) An algorithm for the machine calculation of complex Fourier series. Math Comput 19(90):297–301

  14. San Diego Supercomputing Center. P3DFFT. http://www.sdsc.edu/us/resources/p3dfft/

  15. Chan A, Balaji P, Gropp W, Thakur R (2008) Communication analysis of parallel 3D FFT for flat Cartesian meshes on large Blue Gene systems. In: Proceedings of the IEEE/ACM international conference on high performance computing (HiPC), Bangalore, India, 17–20 Dec 2008

  16. Wallcraft AJ (1991) The NRL layered ocean model users guide. NOARL Report 35, Naval Research Laboratory, Stennis Space Center, MS

  17. Träff JL (2002) Implementing the MPI process topology mechanism. In: SC, pp 1–14

  18. Hur J (1999) An approach for torus embedding. In: ICPP, Washington, DC, USA. IEEE Computer Society, Los Alamitos, p 301

  19. Ou C, Ranka S, Fox G (1996) Fast and parallel mapping algorithms for irregular problems. J Supercomput 10(2):119–140

  20. Bokhari S (1981) On the mapping problem. IEEE Trans Comput 30(3):207–214

  21. Bollinger SW, Midkiff S (1991) Heuristic technique for processor and link assignment in multicomputers. IEEE Trans Comput 40(3):325–333

  22. Mansour N, Ponnusamy R, Choudhary A, Fox GC (1993) Graph contraction for physical optimization methods: a quality-cost tradeoff for mapping data on parallel computers. In: ISC, New York, NY, USA. ACM, New York, pp 1–10

  23. Chockalingam T, Arunkumar S (1992) Randomized heuristics for the mapping problem: the genetic approach. Parallel Comput 18(10):1157–1165

  24. Bhanot G, Gara A, Heidelberger P et al. (2005) Optimizing task layout on the Blue Gene/L supercomputer. IBM J Res Dev 49(2–3):489–500. doi:10.1147/rd.492.0489

  25. Almasi G, Archer C, Castanos J et al. (2004) Implementing MPI on the BlueGene/L Supercomputer. In: Euro-Par, pp 833–845

  26. Yu H, Chung I, Moreira J (2006) Topology mapping for Blue Gene/L supercomputer. In: SC, New York, NY, USA. ACM, New York, p 116

  27. Agarwal T, Sharma A, Laxmikant A, Kale LV (2006) Topology-aware task mapping for reducing communication contention on large parallel machines. In: IPDPS, p 122

  28. Smith B, Bode B (2005) Performance effects of node mappings on the IBM BlueGene/L machine. In: Euro-Par, pp 1005–1013

  29. Faraj A, Yuan X, Lowenthal D (2006) STAR-MPI: self tuned adaptive routines for MPI collective operations. In: Proceedings of the 20th annual international conference on supercomputing (ICS), Cairns, Queensland, Australia, pp 199–208

Author information

Corresponding author

Correspondence to Pavan Balaji.

Additional information

This work was supported in part by the National Science Foundation Grant #0702182 and by Office of Advanced Scientific Computing Research, Office of Science, US Department of Energy, under Contract DE-AC02-06CH11357.

Cite this article

Balaji, P., Gupta, R., Vishnu, A. et al. Mapping communication layouts to network hardware characteristics on massive-scale blue gene systems. Comput Sci Res Dev 26, 247–256 (2011). https://doi.org/10.1007/s00450-011-0168-y
