ABSTRACT
This tutorial introduces participants to the Score-P measurement system and the Vampir trace visualization tool for performance analysis. We provide examples and hands-on exercises covering the full performance engineering workflow cycle on applications that combine MPI, OpenMP, and GPU parallelism. Participants will learn the following concepts: 1. How to collect an initial profile of their code with Score-P. 2. How to evaluate that profile and its associated measurement overhead. 3. How to score a profile and filter a measurement. 4. How to control the Score-P measurement system via environment variables. 5. How to collect useful traces with acceptable overhead. 6. How to interpret trace visualizations in Vampir.
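The workflow described above can be sketched as a shell session (a minimal sketch: the application name `app`, the process count, the buffer size, and the filter-file contents are illustrative and will vary per code and system; the Score-P and Vampir commands themselves are the standard tool entry points):

```shell
# 1. Instrument the application by prefixing the usual compile/link
#    command with the Score-P compiler wrapper.
scorep mpicc -fopenmp -O2 -o app app.c

# 2. Run as usual to collect an initial profile
#    (written to a scorep-* experiment directory).
mpirun -np 4 ./app

# 3. Score the profile: estimate the trace size and per-region
#    overhead, and spot short, frequently called regions.
scorep-score -r scorep-*/profile.cubex

# 4. Exclude high-overhead regions with a filter file
#    (region names here are hypothetical).
cat > app.filter << 'EOF'
SCOREP_REGION_NAMES_BEGIN
  EXCLUDE
    small_helper_*
SCOREP_REGION_NAMES_END
EOF

# 5. Control the measurement via environment variables and
#    collect a trace with acceptable overhead.
export SCOREP_FILTERING_FILE=app.filter
export SCOREP_ENABLE_TRACING=true
export SCOREP_TOTAL_MEMORY=64M   # per-process measurement buffer
mpirun -np 4 ./app

# 6. Visualize the resulting OTF2 trace in Vampir.
vampir scorep-*/traces.otf2
```

Scoring before tracing matters because an unfiltered trace of a tightly instrumented code can be orders of magnitude larger, and slower, than the filtered one.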
Index Terms
- Parallel Performance Engineering using Score-P and Vampir