Locating and categorizing inefficient communication patterns in HPC systems using inter-process communication traces☆
Introduction
Demand for High Performance Computing (HPC) systems continues to grow as industrial and research sectors such as bioinformatics, medical information processing, and financial analytics need powerful systems to solve large and complex problems (Heldens et al., 2020). The popularity of HPC programs has grown further with the advent of multicore and cloud computing environments.
HPC programs developed using the Message Passing Interface (MPI) standard (MPI Forum, 2012) rely on a large number of processes that work together by exchanging messages to solve computationally intensive problems. MPI organizes processes into groups called communicators. Processes in one communicator interact with each other according to a virtual topology, which usually follows a linear, two-dimensional, or three-dimensional mesh structure. Processes communicate with their nearest or non-nearest neighbors in the mesh. In a typical MPI program, these communications are repetitive and form communication patterns. A communication pattern groups sequences of MPI communication events from different processes that are working towards a specific task. The binary tree and butterfly patterns are examples of communication patterns (Navaridas et al., 2008). Fig. 1 shows the butterfly and the binary tree patterns for eight processes.
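To make the two example patterns concrete, the following minimal sketch enumerates who communicates with whom for eight processes. It is an illustration only: the function names are hypothetical, the butterfly uses the common "partner differs in one bit" rule, and the binary tree assumes a heap-style layout rooted at rank 0, which may differ from the layout drawn in Fig. 1.

```python
def butterfly_pairs(num_procs):
    """Return, per stage, the (rank, partner) exchanges of a butterfly pattern.

    Assumes num_procs is a power of two; each pair is listed once.
    """
    if num_procs < 1 or num_procs & (num_procs - 1):
        raise ValueError("num_procs must be a power of two")
    stages = []
    distance = 1
    while distance < num_procs:
        # In each stage, rank r exchanges with the rank whose id differs in one bit.
        stage = [(r, r ^ distance) for r in range(num_procs) if r < (r ^ distance)]
        stages.append(stage)
        distance *= 2
    return stages


def binary_tree_edges(num_procs):
    """Return (parent, child) edges of a binary tree rooted at rank 0.

    Assumed heap-style layout: the children of rank p are 2p + 1 and 2p + 2.
    """
    return [((child - 1) // 2, child) for child in range(1, num_procs)]


if __name__ == "__main__":
    for stage_no, stage in enumerate(butterfly_pairs(8)):
        print("butterfly stage", stage_no, stage)
    print("binary tree edges", binary_tree_edges(8))
```

With eight processes the butterfly completes in three stages, in which every process takes part, whereas the tree funnels messages through a few inner ranks.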
Performance analysis and debugging of HPC systems require dynamic analysis techniques due to the distributed nature of these systems. An early work by Preissl et al. (2008) showed that automatic identification of communication patterns from execution traces can be useful for understanding an application’s communication behavior, which in turn facilitates debugging and performance analysis tasks. The problem is that typical traces can be overwhelmingly large, with many instances of various communication patterns. The mere detection of communication patterns may still generate more data than a software analyst can grasp. To alleviate this problem, researchers have proposed the concept of trace segmentation, in which a large trace is partitioned into distinct segments that depict execution phases of the traced scenario, and communication patterns are detected within each segment (Casas et al., 2010, Isaacs et al., 2015, Alawneh et al., 2016).
An execution phase is broadly defined as a region within the trace that contains communication patterns, which implement a specific program functionality (Alawneh et al., 2016). Trace segmentation is also used to support program comprehension tasks in monolithic systems (Pirzadeh et al., 2013). The main objective is to provide a way for the software analyst to only focus on parts of the trace of interest instead of browsing the whole trace. Casas et al. (2010) and Chetsa et al. (2013) showed how MPI trace execution phases can help with performance optimization tasks by uncovering regions in a trace with the highest latency. Isaacs et al. (2015) presented a trace visualization and analysis tool that logically orders and visualizes the MPI communication behavior into fine-grained phases to determine the lateness in program operations using temporal metrics and visual inspection.
In our previous work (Alawneh et al., 2016), we proposed an effective trace segmentation approach, which involves two main steps. In the first step, we detect communication patterns in the entire trace using natural language processing techniques. In the second step, we use the extracted communication patterns to identify dense homogeneous clusters, which represent distinct execution phases of the trace. This is achieved using information theory concepts such as Shannon entropy (Shannon, 1948) and the Jensen–Shannon Divergence measure (Grosse et al., 2002). The new contributions of this paper are summarized as follows:
- We improve our previous technique for segmenting traces into execution phases using the Akaike Information Criterion (AIC) (Akaike, 1981) to identify finer execution phases.
- We extend the communication pattern detection approach by using distinctive events to identify the boundaries of coherent communication events in each process trace, which facilitates the detection of process repeating patterns.
- We propose an approach for detecting inefficient communication pattern instances in trace segments using statistical analysis. More specifically, we use the Median Absolute Deviation (MAD) and the Modified Z-score measures (Iglewicz and Hoaglin, 1993) to determine slow communication patterns.
- We propose an approach for the categorization of communication patterns using the Analytic Hierarchy Process (AHP) (Saaty, 1990) by examining the complexity and severity levels for slow patterns in execution phases.
- We demonstrate the effectiveness of our approach by applying it to five large traces generated from three different HPC systems.
- Through the analysis of a sample of inefficient patterns detected by our approach, we provide a detailed discussion on the potential root causes, which demonstrate the usefulness of our approach in practice.
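The MAD and Modified Z-score measures named in the contributions above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names and the sample durations are hypothetical, the constant 0.6745 and the cutoff of 3.5 are the values commonly attributed to Iglewicz and Hoaglin, and flagging only the positive (slower) side is an assumption made here.

```python
from statistics import median


def modified_z_scores(durations):
    """Modified Z-score of each value: 0.6745 * (x - median) / MAD."""
    med = median(durations)
    # Median Absolute Deviation: median of the absolute deviations from the median.
    mad = median([abs(x - med) for x in durations])
    if mad == 0:
        # Degenerate case (at least half the values equal the median):
        # this sketch simply reports no deviation.
        return [0.0 for _ in durations]
    return [0.6745 * (x - med) / mad for x in durations]


def slow_instances(durations, threshold=3.5):
    """Indices of pattern instances whose duration is unusually long."""
    scores = modified_z_scores(durations)
    return [i for i, score in enumerate(scores) if score > threshold]


if __name__ == "__main__":
    # Hypothetical durations (arbitrary time units) of instances of one pattern.
    sample = [10, 11, 9, 10, 12, 10, 50]
    print(slow_instances(sample))
```

Because both the center and the spread are medians, a single very slow instance does not inflate the scale against which the other instances are judged, which is the usual reason to prefer this measure over a mean-based Z-score.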
The rest of the paper is organized as follows. Section 2 presents a background of HPC and sequence segmentation followed by related studies on techniques for the analysis of MPI programs in Section 3. Section 4 details the proposed approach. In Section 5, we apply our approach on several traces generated from HPC systems and show how it could detect patterns of inefficient behavior. We conclude our paper in Section 6 and discuss future directions.
Background
This section starts by providing a more detailed view of HPC and communication patterns. Then, it presents the sequence segmentation technique that we use in our study for identifying computational phases.
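As a rough illustration of entropy-based sequence segmentation, the sketch below finds the single split point of a symbol sequence that maximizes the length-weighted Jensen–Shannon divergence between the two resulting segments. It is a simplified sketch under stated assumptions: the function names are hypothetical, each symbol stands for one communication pattern instance, and the stopping criterion that a recursive segmentation needs (for example a significance test or an information criterion) is omitted.

```python
from collections import Counter
from math import log2


def entropy(symbols):
    """Shannon entropy (in bits) of the empirical symbol distribution."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def js_divergence(left, right):
    """Length-weighted Jensen-Shannon divergence between two segments."""
    n = len(left) + len(right)
    # The entropy of the concatenation is the entropy of the weighted mixture.
    return (entropy(left + right)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))


def best_split(sequence):
    """Return (index, divergence) of the split that maximizes the divergence.

    Returns (None, 0.0) when no split separates the sequence at all.
    """
    best_index, best_value = None, 0.0
    for i in range(1, len(sequence)):
        value = js_divergence(sequence[:i], sequence[i:])
        if value > best_value:
            best_index, best_value = i, value
    return best_index, best_value


if __name__ == "__main__":
    # Hypothetical trace: pattern "A" dominates first, then pattern "B".
    print(best_split(list("AAAABBBB")))
```

Applying the same search recursively to each resulting segment yields progressively finer phases.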
Related work
Several tools have been developed for the visualization of MPI traces to facilitate program comprehension and system analysis tasks (ZIH, 2022, Shende and Malony, 2006). Although trace visualization tools capture details regarding the whole execution trace, it is difficult to analyze and comprehend the program execution by mere reliance on visualization techniques. For example, Fig. 4 shows a zoomed-in view of a trace of 16 processes using the Vampir (ZIH, 2022) visualization tool. This typical
The approach
Fig. 5 shows our overall approach for detecting and locating inefficient communication patterns in MPI traces. We start by extracting the communication patterns by identifying the repeated sequences of MPI calls in each process trace. The output of this step is a sequence of all instances of the detected communication patterns in the trace ordered using the happened-before relationship. Second, we locate the communication patterns within specific trace segments using information theory
Evaluation
We tested our approach on five traces generated from three HPC systems: SMG2000 (Brown et al., 2000), AMG2013 (ASC, 2013), and the NAS BT parallel benchmark (NAS, 1994).
To generate traces, we instrumented the applications statically using the Score-P (VI-HPS, 2022) tool, which is perhaps one of the most recommended instrumentation tools for MPI-based systems. We instrumented all the functions of a system in the same way to ensure that the added overhead, though it is known to be low with
Conclusion
We presented a novel approach for the discovery of slow communication patterns in execution traces using statistical analysis techniques. Our approach can also be used for program comprehension to help analysts understand the behavior of the inter-process communication in MPI programs. Our approach is built on an improved version of our previous trace segmentation approach, which uses information theory principles to split a trace into execution phases. The approach relies on the detection of
Reproduction package
The implementation of our approach as well as the data used in the evaluation section are made available in the following repository: https://github.com/lalawneh/HPC-MPI-Traces.
CRediT authorship contribution statement
Luay Alawneh: Conceptualization, Methodology, Literature review, Trace generation, Models and techniques, Implementation and experiments, Validation, Writing – review & editing. Abdelwahab Hamou-Lhadj: Conceptualization, Methodology, Validation, Writing – review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
The authors would like to thank the Deanship of Research at Jordan University of Science & Technology for funding this research (ID. 20200285). Also, the first author, Dr. Luay Alawneh, would like to thank Concordia University, Canada, where this research was conducted during his sabbatical leave.
References (64)
- Likelihood of a model and information criteria. J. Econometrics (1981)
- Segmenting large traces of inter-process communication with a focus on high performance computing systems. J. Syst. Softw. (2016)
- An AHP (analytic hierarchy process)/ANP (analytic network process)-based multi-criteria decision approach for the selection of solar-thermal power plant investment projects. Energy (2014)
- Understanding performance of SMP clusters running MPI programs. Future Gener. Comput. Syst. (2001)
- The analytic hierarchy process supporting decision making for sustainable development: An overview of applications. J. Cleaner Prod. (2019)
- Entropy measures for early detection of bearing faults. Physica A (2019)
- Applications of recursive segmentation to the analysis of DNA sequences. Comput. Chem. (2002)
- Stratified sampling of execution traces: Execution phases serving as strata. Sci. Comput. Program. (2013)
- MPI performance engineering with the MPI tool interface: the integration of MVAPICH and TAU. Parallel Comput. (2018)
- How to make a decision: the analytic hierarchy process. European J. Oper. Res. (1990)
- Investigating solutions for the development of a green bond market: Evidence from analytic hierarchy process. Finance Res. Lett.
- Communication complexity of Byzantine agreement, revisited
- Automatic on-line detection of MPI application structure with event flow graphs
- Identifying the root causes of wait states in large-scale parallel applications. ACM Trans. Parallel Comput. (TOPC)
- Scalable critical-path based performance analysis
- Semicoarsening multigrid on distributed memory machines. SIAM J. Sci. Comput.
- Automatic phase detection of MPI applications. Parallel Comput. Archit. Algorithms Appl.
- Automatic phase detection and structure extraction of MPI applications. Int. J. High Perform. Comput. Appl.
- The context sensitivity problem in biological sequence segmentation
- A user friendly phase detection methodology for HPC systems’ analysis
- Functional and non-functional requirements prioritization: empirical evaluation of IPA, AHP-based, and HAM-based approaches. Soft Comput.
- The SPMD model: Past, present and future
- Profiling and tracing tools for performance analysis of large scale applications. PRACE: Partnersh. Adv. Comput. Europe
- Open trace format 2: The next generation of scalable trace formats and support libraries
- Employing MPI_T in MPI advisor to optimize application performance. Int. J. High Perform. Comput. Appl.
- The Scalasca performance toolset architecture. Concurr. Comput.: Pract. Exper.
- Automatic detection of parallel applications computation phases
- Automatic refinement of parallel applications structure detection
- Analysis of symbolic sequences using the Jensen-Shannon divergence. Phys. Rev. E
- Algorithms on strings, trees, and sequences: Computer science and computational biology. ACM SIGACT News
Dr. Luay Alawneh is an associate professor in the Department of Software Engineering at Jordan University of Science and Technology, Irbid, Jordan. His research interests are in software engineering, software maintenance and evolution, parallel processing, high performance computing systems, machine learning, and deep learning. Luay received his Ph.D. in electrical and computer engineering from Concordia University in Canada. In addition to his research achievements, Luay has extensive industrial experience in software engineering and software development from North American firms.
Dr. Abdelwahab Hamou-Lhadj is a Professor in the Department of ECE at the Gina Cody School of Engineering and Computer Science, Concordia University, Montreal, Canada. His research interests are in software engineering, AI for IT operations, software tracing and logging, system observability, and model-driven engineering. He received his Ph.D. from the University of Ottawa, Canada. He is a senior member of IEEE, a long-standing member of ACM, and a professional engineer with OIQ. He is also a frequent contributor to the Object Management Group (OMG) certification programs, OCUP 2 and OCEB 2.
☆ Editor: Dr Earl Barr.