PredCom: A Predictive Approach to Collecting Approximated Communication Traces
- Univ. of Electro-Communications, Tokyo (Japan)
- Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Technical Univ. of Munich (Germany)
Communication traces collected from MPI applications are an important source of information for performance optimization, as they can help analysts determine communication patterns and identify inefficiencies. However, collecting them, especially at scale, is time-consuming, since it usually requires running the complete target application on a large number of nodes. In this work, we present PredCom, a tool-chain that generates a predictive communication proxy from information gathered in a few small-scale runs, which allows us to extract approximate communication traces with an accuracy high enough for most analysis goals. To do so, we combine LLVM passes on the original source code (to capture static program structure) with parameter prediction (to capture dynamic and scaling behavior). This approach drastically reduces the time needed to collect communication traces, even for large numbers of MPI processes. We demonstrate that PredCom generates communication traces of various applications up to 1612x faster with an accuracy loss of 0.11 on average compared to the original large-scale traces, and we show that the generated traces can be used to optimize process placement.
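The parameter-prediction idea in the abstract (learn how a communication parameter changes across a few small runs, then extrapolate to a large process count) can be illustrated with a minimal sketch. The abstract does not say which prediction model PredCom uses, so the power-law form, the function names `fit_power_law` and `predict`, and the sample measurements below are all assumptions made purely for illustration.

```python
import math


def fit_power_law(process_counts, values):
    """Least-squares fit of values ~= a * p**b, done in log-log space.

    Assumed model for illustration only; not taken from the PredCom paper.
    """
    xs = [math.log(p) for p in process_counts]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # scaling exponent
    a = math.exp(mean_y - b * mean_x)    # scale factor
    return a, b


def predict(a, b, p):
    """Extrapolate the fitted parameter to p processes."""
    return a * p ** b


# Hypothetical per-message sizes (bytes) observed in four small-scale runs,
# keyed by MPI process count. These numbers are invented for the example.
small_runs = {2: 524288, 4: 262144, 8: 131072, 16: 65536}

a, b = fit_power_law(list(small_runs), list(small_runs.values()))
predicted = predict(a, b, 1024)
print(f"exponent b = {b:.3f}, predicted size at 1024 processes = {predicted:.1f} bytes")
```

In this toy data set the message size halves whenever the process count doubles, so the fit recovers an exponent close to -1. A real tool would need a model per communication parameter (message size, peer rank, call count) and a way to detect parameters that do not follow a simple scaling law.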
- Research Organization:
- Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Sponsoring Organization:
- USDOE National Nuclear Security Administration (NNSA); Japan Science and Technology Agency (JST); Kayamori Foundation of Informational Science Advancement; Japan Society for the Promotion of Science (JSPS)
- Grant/Contract Number:
- AC52-07NA27344; JP20H04193
- OSTI ID:
- 1769152
- Report Number(s):
- LLNL-JRNL-748663; 933780
- Journal Information:
- IEEE Transactions on Parallel and Distributed Systems, Vol. 32, Issue 1; ISSN 1045-9219
- Publisher:
- IEEE
- Country of Publication:
- United States
- Language:
- English