Characterizing MPI matching via trace-based simulation
- Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
With the increased scale expected on future leadership-class systems, detailed information about the resource usage and performance of MPI message matching provides important insights into how to maintain application performance on next-generation systems. However, obtaining MPI message matching performance data is often not possible without significant effort. A common approach is to instrument an MPI implementation to collect relevant statistics. While this approach can provide important data, collecting matching data at runtime perturbs the application's execution, including its matching performance, and is highly dependent on the MPI library's matchlist implementation. In this paper, we introduce a trace-based simulation approach to obtain detailed MPI message matching performance data for MPI applications without perturbing their execution. Using a number of key parallel workloads, we demonstrate that this simulator approach can rapidly and accurately characterize matching behavior. Specifically, we use our simulator to collect several important statistics about the operation of the MPI posted and unexpected queues. For example, we present data about search lengths and the duration that messages spend in the queues waiting to be matched. Furthermore, data gathered using this simulation-based approach have significant potential to aid hardware designers in determining resource allocation for MPI matching functions and provide application and middleware developers with insight into the scalability issues associated with MPI message matching.
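The queue statistics named in the abstract (search lengths and time spent waiting to be matched) can be illustrated with a minimal sketch. It is not the authors' simulator. It assumes a hypothetical single-rank trace of `(kind, source, tag, time)` events, models both queues as linearly searched lists, uses `ANY` as a stand-in wildcard, and ignores communicators.

```python
from collections import namedtuple

ANY = -1  # assumption: stand-in for MPI_ANY_SOURCE / MPI_ANY_TAG

Entry = namedtuple("Entry", "src tag time")


def matches(recv, msg):
    """A receive matches a message when source and tag agree or are wildcards."""
    return recv.src in (ANY, msg.src) and recv.tag in (ANY, msg.tag)


def search(queue, pred):
    """Linear search from the head; return (index or None, entries inspected)."""
    for i, entry in enumerate(queue):
        if pred(entry):
            return i, i + 1
    return None, len(queue)


def simulate(trace):
    """Replay a hypothetical trace and collect per-queue statistics."""
    posted, unexpected = [], []
    stats = {
        "posted_search": [],      # entries inspected per message arrival
        "unexpected_search": [],  # entries inspected per posted receive
        "posted_wait": [],        # time a receive waited in the posted queue
        "unexpected_wait": [],    # time a message waited in the unexpected queue
    }
    for kind, src, tag, time in trace:
        new = Entry(src, tag, time)
        if kind == "recv":
            # A newly posted receive first searches the unexpected queue.
            idx, inspected = search(unexpected, lambda m: matches(new, m))
            stats["unexpected_search"].append(inspected)
            if idx is None:
                posted.append(new)
            else:
                stats["unexpected_wait"].append(time - unexpected.pop(idx).time)
        elif kind == "arrive":
            # An arriving message first searches the posted receive queue.
            idx, inspected = search(posted, lambda r: matches(r, new))
            stats["posted_search"].append(inspected)
            if idx is None:
                unexpected.append(new)
            else:
                stats["posted_wait"].append(time - posted.pop(idx).time)
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return stats
```

Because the replay works on a recorded trace rather than a running application, collecting these statistics does not perturb the execution being studied, which is the motivation the abstract gives for the trace-based approach.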
- Research Organization:
- Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
- Sponsoring Organization:
- USDOE National Nuclear Security Administration (NNSA)
- Grant/Contract Number:
- AC04-94AL85000
- OSTI ID:
- 1444084
- Alternate ID(s):
- OSTI ID: 1457519
- Report Number(s):
- SAND-2018-5449J; SAND-2018-6407J; 663297
- Journal Information:
- Parallel Computing, Vol. 2017; ISSN 0167-8191
- Publisher:
- Elsevier
- Country of Publication:
- United States
- Language:
- English