
Visualizing, Measuring, and Tuning Adaptive MPI Parameters

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNPSE,volume 11027))

Abstract

Adaptive MPI (AMPI) is an advanced MPI runtime environment that offers several features beyond traditional MPI runtimes, which can lead to better utilization of the underlying hardware platform and therefore higher performance. These features are overdecomposition through virtualization, and load balancing via rank migration. However, choosing which of these features to use and finding their optimal parameters is a challenging task, since different applications and systems may require different options. Furthermore, there is a lack of information about the impact of each option. In this paper, we present a new visualization of AMPI in its companion Projections tool, which depicts the operation of an MPI application and details the impact of the different AMPI features on its resource usage. We show how these visualizations can help improve the efficiency and execution time of an MPI application. Applying optimizations indicated by the performance analysis to two MPI-based applications results in performance improvements of up to 18% from overdecomposition and load balancing.
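The two tunables the abstract names, overdecomposition and load balancing, are typically exposed as runtime parameters in AMPI. As a rough illustration (the exact flags and program name here are assumptions; consult the AMPI manual for the current interface), an MPI code built with AMPI's compiler wrapper can be launched with more virtual ranks than physical processors and with a load-balancing strategy selected at the command line:

```shell
# Build the MPI application with AMPI's compiler wrapper; the
# tracing flag (illustrative) emits logs that Projections can load.
ampicc -o jacobi jacobi.c -tracemode projections

# Run on 4 physical processors with 16 virtual ranks
# (an overdecomposition ratio of 4), selecting a greedy
# load balancer that migrates ranks at runtime.
./charmrun +p4 ./jacobi +vp16 +balancer GreedyLB
```

Choosing the virtualization ratio (+vp relative to +p) and the balancing strategy is exactly the parameter-tuning problem the paper's visualizations are meant to inform.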


Notes

  1. https://codesign.llnl.gov/lulesh.php.

  2. https://github.com/ParRes/Kernels.

  3. https://charm.cs.illinois.edu/software.


Acknowledgments

This paper is based in part upon work supported by the Department of Energy, National Nuclear Security Administration, under Award Number DE-NA0002374.

Author information

Corresponding author

Correspondence to Matthias Diener.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Diener, M., White, S., Kale, L.V. (2019). Visualizing, Measuring, and Tuning Adaptive MPI Parameters. In: Bhatele, A., Boehme, D., Levine, J., Malony, A., Schulz, M. (eds) Programming and Performance Visualization Tools. ESPT/VPA 2017, ESPT/VPA 2018. Lecture Notes in Computer Science, vol 11027. Springer, Cham. https://doi.org/10.1007/978-3-030-17872-7_13


  • DOI: https://doi.org/10.1007/978-3-030-17872-7_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-17871-0

  • Online ISBN: 978-3-030-17872-7

  • eBook Packages: Computer Science (R0)
