DOI: 10.1145/2488551.2488603

Advancing application process affinity experimentation: open MPI's LAMA-based affinity interface

Published: 15 September 2013

Abstract

Application studies have shown that tuning the placement of Message Passing Interface (MPI) processes within a server's non-uniform memory access (NUMA) topology can have a dramatic impact on performance. The performance implications are magnified when running a parallel job across multiple server nodes, especially with large-scale MPI applications. As processor and NUMA topologies continue to grow more complex to meet the demands of ever-increasing processor core counts, best practices regarding process placement also need to evolve.
This paper presents Open MPI's flexible interface for distributing the individual processes of a parallel job across processing resources in a High Performance Computing (HPC) system, paying particular attention to the internal server NUMA topologies. The interface is a realization of the Locality-Aware Mapping Algorithm (LAMA) [8], and provides both simple and complex mechanisms for specifying regular process-to-processor mappings and affinitization. Open MPI's LAMA implementation is intended as a tool for MPI users to experiment with different process placement strategies on both current and emerging HPC platforms.
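
Because the interface is aimed at placement experimentation, it helps to be able to observe where each rank actually lands. The sketch below is illustrative only and is not taken from the paper: a small C MPI program in which every rank reports its host name and the CPU it is currently executing on via the Linux-specific sched_getcpu() call (the file and program names are made up for the example). Launching it with Open MPI's mpirun under different mapping and binding policies, for instance together with the --report-bindings option, makes the effect of a placement strategy directly visible.

/*
 * Minimal placement-verification sketch (illustrative only, not from the paper).
 * Each rank prints its host name and the CPU it is currently running on.
 * Compile:  mpicc -o where_am_i where_am_i.c
 * Run:      mpirun -np 8 --report-bindings ./where_am_i
 */
#define _GNU_SOURCE
#include <mpi.h>
#include <sched.h>   /* sched_getcpu() -- Linux-specific */
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, namelen;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(host, &namelen);

    /* sched_getcpu() reports the CPU this thread is executing on right now;
       with a binding policy in effect it should stay within the bound set. */
    printf("rank %d of %d on %s, cpu %d\n", rank, size, host, sched_getcpu());

    MPI_Finalize();
    return 0;
}

Note that sched_getcpu() only shows the momentary CPU; for the full binding mask of a process, the hwloc library cited as [4] can report the complete set of processing units to which a process is bound.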

References

[1] G. Almási, C. Archer, et al. Implementing MPI on the BlueGene/L supercomputer. In M. Danelutto et al., editors, Euro-Par 2004 Parallel Processing, volume 3149 of Lecture Notes in Computer Science, pages 833--845. Springer Berlin/Heidelberg, 2004.
[2] Argonne National Laboratory. MPICH. http://www.mpich.org/.
[3] Argonne National Laboratory. Using the Hydra process manager. http://wiki.mpich.org/mpich/index.php/Using_the_Hydra_Process_Manager.
[4] F. Broquedis, J. Clet-Ortega, et al. hwloc: A generic framework for managing hardware affinities in HPC applications. In Proceedings of the 18th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP2010), pages 180--186, Pisa, Italy, February 2010. IEEE Computer Society Press.
[5] S. Ethier, W. M. Tang, et al. Large-scale gyrokinetic particle simulation of microturbulence in magnetically confined fusion plasmas. IBM Journal of Research and Development, 52:105--115, January 2008.
[6] E. Gabriel, G. E. Fagg, et al. Open MPI: Goals, concept, and design of a next generation MPI implementation. In Proceedings of the 11th European PVM/MPI Users' Group Meeting, pages 97--104, Budapest, Hungary, September 2004.
[7] M. Gilge. IBM System Blue Gene Solution: Blue Gene/Q application development. Technical report, IBM, February 2013.
[8] J. Hursey, J. M. Squyres, et al. Locality-aware parallel process mapping for multi-core HPC systems. In IEEE International Conference on Cluster Computing, Austin, TX, September 2011. (Poster).
[9] E. Jeannot and G. Mercier. Near-optimal placement of MPI processes on hierarchical NUMA architectures. In Proceedings of the 16th International Euro-Par Conference on Parallel Processing, Euro-Par'10, pages 199--210, Berlin, Heidelberg, 2010. Springer-Verlag.
[10] M. Karo, R. Lagerstrom, et al. The application level placement scheduler. In Cray Users Group, 2006.
[11] A. Yoo, M. Jette, et al. SLURM: Simple Linux Utility for Resource Management. In D. Feitelson, L. Rudolph, and U. Schwiegelshohn, editors, Job Scheduling Strategies for Parallel Processing, volume 2862 of Lecture Notes in Computer Science, pages 44--60. Springer Berlin/Heidelberg, 2003.
[12] H. Yu, I.-H. Chung, et al. Topology mapping for Blue Gene/L supercomputer. In Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, SC '06, New York, NY, USA, 2006. ACM.

Published In

EuroMPI '13: Proceedings of the 20th European MPI Users' Group Meeting
September 2013
289 pages
ISBN: 9781450319034
DOI: 10.1145/2488551

Sponsors

  • ARCOS: Computer Architecture and Technology Area, Universidad Carlos III de Madrid

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. MPI
  2. NUMA
  3. locality
  4. process affinity
  5. resource management

Qualifiers

  • Research-article

Conference

EuroMPI '13
Sponsor:
  • ARCOS
EuroMPI '13: 20th European MPI Users' Group Meeting
September 15 - 18, 2013
Madrid, Spain

Acceptance Rates

EuroMPI '13 paper acceptance rate: 22 of 47 submissions, 47%
Overall acceptance rate: 66 of 139 submissions, 47%

Cited By

  • (2024) Exploring Architectural-Aware Affinity Policies in Modern HPC Runtimes. In Practice and Experience in Advanced Research Computing 2024: Human Powered Computing, pages 1-5. DOI: 10.1145/3626203.3670566. Online publication date: 17-Jul-2024.
  • (2023) PAARes: an efficient process allocation based on the available resources of cluster nodes. The Journal of Supercomputing, 79(9):10423-10441. DOI: 10.1007/s11227-023-05085-7. Online publication date: 8-Feb-2023.
  • (2020) Application-Driven Requirements for Node Resource Management in Next-Generation Systems. In 2020 IEEE/ACM International Workshop on Runtime and Operating Systems for Supercomputers (ROSS), pages 1-11. DOI: 10.1109/ROSS51935.2020.00006. Online publication date: Nov-2020.
  • (2018) HPC Process and Optimal Network Device Affinitization. IEEE Transactions on Multi-Scale Computing Systems, 4(4):749-757. DOI: 10.1109/TMSCS.2018.2871444. Online publication date: 1-Oct-2018.
  • (2017) On the Overhead of Topology Discovery for Locality-Aware Scheduling in HPC. In 2017 25th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP), pages 186-190. DOI: 10.1109/PDP.2017.35. Online publication date: 2017.
  • (2017) MPI Process and Network Device Affinitization for Optimal HPC Application Performance. In 2017 IEEE 25th Annual Symposium on High-Performance Interconnects (HOTI), pages 80-86. DOI: 10.1109/HOTI.2017.12. Online publication date: Aug-2017.
  • (2017) APHiD. In Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pages 228-237. DOI: 10.1109/CCGRID.2017.33. Online publication date: 14-May-2017.
  • (2016) Exposing the Locality of Heterogeneous Memory Architectures to HPC Applications. In Proceedings of the Second International Symposium on Memory Systems, pages 30-39. DOI: 10.1145/2989081.2989115. Online publication date: 3-Oct-2016.
  • (2014) Managing the topology of heterogeneous cluster nodes with hardware locality (hwloc). In 2014 International Conference on High Performance Computing & Simulation (HPCS), pages 74-81. DOI: 10.1109/HPCSim.2014.6903671. Online publication date: Jul-2014.
