ExaMPI: A Modern Design and Implementation to Accelerate Message Passing Interface Innovation

  • Conference paper
High Performance Computing (CARLA 2019)

Abstract

The difficulty of deep experimentation with Message Passing Interface (MPI) implementations, which are large and complex, substantially raises the cost and complexity of proof-of-concept activities and limits the community of potential contributors to new and better MPI features and implementations alike. Our goal is to enable researchers to experiment rapidly and easily with new concepts, algorithms, and internal protocols for MPI. To that end, we introduce ExaMPI, a modern MPI-3.x subset with a robust MPI-4.x roadmap. We discuss its design, early implementation, and ongoing use in parallel programming research, as well as specific research activities that ExaMPI enables.

Architecturally, ExaMPI is a C++17 library designed for modularity, extensibility, and understandability. The code base uses native C++ threading with thread-safe data structures and a modular progress engine. In addition, a transport abstraction supports UDP, TCP, OFED verbs, and libfabric for high-performance networks (a schematic sketch of such an interface follows the abstract).

By enabling researchers with ExaMPI, we seek to accelerate innovations and increase the number of new experiments and experimenters, all while expanding MPI’s applicability.
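
To make the transport abstraction mentioned above concrete, the following C++17 sketch is purely illustrative: the names Transport, LoopbackTransport, and make_transport are hypothetical and are not taken from ExaMPI's code base. It shows the general idea of placing concrete backends (UDP, TCP, OFED verbs, libfabric) behind one interface so that protocols and the progress engine stay backend-agnostic.

    // Hypothetical sketch (names are not from ExaMPI's code base): a pluggable
    // transport abstraction in C++17. Protocols and the progress engine talk
    // only to the Transport interface; backends (UDP, TCP, verbs, libfabric)
    // plug in behind it.
    #include <algorithm>
    #include <cstddef>
    #include <iostream>
    #include <memory>
    #include <string>
    #include <vector>

    class Transport {
    public:
        virtual ~Transport() = default;
        virtual void send(int dest_rank, const void* buf, std::size_t bytes) = 0;
        virtual std::size_t recv(int src_rank, void* buf, std::size_t bytes) = 0;
        virtual std::string name() const = 0;
    };

    // A trivial in-process backend; a real one would wrap sockets, OFED verbs,
    // or libfabric endpoints.
    class LoopbackTransport : public Transport {
    public:
        void send(int, const void* buf, std::size_t bytes) override {
            const auto* p = static_cast<const std::byte*>(buf);
            staged_.assign(p, p + bytes);  // stage the message locally
        }
        std::size_t recv(int, void* buf, std::size_t bytes) override {
            const std::size_t n = std::min(bytes, staged_.size());
            std::copy_n(staged_.begin(), n, static_cast<std::byte*>(buf));
            return n;
        }
        std::string name() const override { return "loopback"; }
    private:
        std::vector<std::byte> staged_;
    };

    // Factory: switching backends becomes a configuration choice rather than
    // a change to protocol or progress-engine code.
    std::unique_ptr<Transport> make_transport(const std::string& which) {
        if (which == "loopback") return std::make_unique<LoopbackTransport>();
        return nullptr;  // "udp", "tcp", "verbs", "libfabric" would go here
    }

    int main() {
        auto t = make_transport("loopback");
        int out = 42, in = 0;
        t->send(0, &out, sizeof out);
        t->recv(0, &in, sizeof in);
        std::cout << t->name() << " delivered " << in << "\n";
    }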

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344 with Lawrence Livermore National Security, LLC (LLNL-CONF-775497), and with partial support from the National Science Foundation under Grant Nos. CCF-1562659, CCF-1562306, CCF-1617690, CCF-1822191, and CCF-1821431. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or Lawrence Livermore National Laboratory.


Notes

  1. In fact, a raft of papers in the literature (e.g., [7–10, 15, 16, 18, 24, 27, 29]) show workarounds to polling progress that involve sporadically and haphazardly strewing one’s code with MPI_Test. Also, the OSU benchmark for overlap explicitly depends on the use of MPI_Test [4, 22]. (A minimal sketch of this polling pattern appears after these notes.)

  2. to avoid code cloning, enable use of compiler-supported threads, employ metaprogramming and polymorphism where appropriate, and enable enhanced modularity over C.

  3. independent progress of messages through the network, independently of how often an application calls MPI functions; see also Sect. 5.2.

  4. based on the long experience of the first author and a review of many applications’ use of MPI, as supported by a recent study by some of us and others [23].
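
As a companion to note 1, the sketch below is illustrative only: it is not code from the paper, and do_some_compute_chunk is a hypothetical placeholder. It shows the common workaround of interleaving MPI_Test calls with computation so that a nonblocking transfer makes progress; with the strong, application-independent progress described in note 3, the polling calls inside the loop would not be needed to achieve overlap.

    // Illustrative only (not code from the paper): the MPI_Test "manual
    // progress" workaround referenced in note 1. The application pokes the
    // MPI library between compute chunks so that a nonblocking transfer
    // keeps advancing.
    #include <mpi.h>
    #include <vector>

    // Hypothetical stand-in for a unit of useful computation.
    void do_some_compute_chunk(std::vector<double>& v) {
        for (double& x : v) x = x * 1.000001 + 1.0;
    }

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank = 0, size = 1;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        std::vector<double> payload(1 << 20, 1.0), work(1 << 16, 1.0);
        MPI_Request req = MPI_REQUEST_NULL;
        if (rank == 0 && size > 1)
            MPI_Isend(payload.data(), static_cast<int>(payload.size()),
                      MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &req);
        else if (rank == 1)
            MPI_Irecv(payload.data(), static_cast<int>(payload.size()),
                      MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &req);

        // The workaround: sprinkle MPI_Test between compute chunks to force
        // the library to make progress on the outstanding request.
        int done = 0;
        while (!done) {
            do_some_compute_chunk(work);
            MPI_Test(&req, &done, MPI_STATUS_IGNORE);
        }

        MPI_Finalize();
        return 0;
    }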

References

  1. Cray MPI. https://pubs.cray.com/content/S-2529/17.05/xctm-series-programming-environment-user-guide-1705-s-2529/mpt

  2. IBM Spectrum MPI. https://tinyurl.com/yy9cwm4p

  3. MPI/Pro. https://www.runtimecomputing.com/products/mpipro/

  4. OSU micro-benchmarks 5.6.2. http://mvapich.cse.ohio-state.edu/benchmarks/

  5. Intel MPI library, August 2018. https://software.intel.com/en-us/mpi-library

  6. Bangalore, P., Rabenseifner, R., Holmes, D., Jaeger, J., Mercier, G., Blaas-Schenner, C., Skjellum, A.: Exposition, clarification, and expansion of MPI semantic terms and conventions (2019). Under review

  7. Barigou, Y., Venkatesan, V., Gabriel, E.: Auto-tuning non-blocking collective communication operations. In: 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, pp. 1204–1213, May 2015. https://doi.org/10.1109/IPDPSW.2015.15

  8. Castillo, E., et al.: Optimizing computation-communication overlap in asynchronous task-based programs: poster. In: Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming, PPoPP 2019, pp. 415–416. ACM, New York (2019). https://doi.org/10.1145/3293883.3295720

  9. Denis, A., Trahay, F.: MPI overlap: benchmark and analysis. In: 2016 45th International Conference on Parallel Processing (ICPP), pp. 258–267, August 2016. https://doi.org/10.1109/ICPP.2016.37

  10. Didelot, S., Carribault, P., Pérache, M., Jalby, W.: Improving MPI communication overlap with collaborative polling. In: Träff, J.L., Benkner, S., Dongarra, J.J. (eds.) EuroMPI 2012. LNCS, vol. 7490, pp. 37–46. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33518-1_9

  11. Dimitrov, R.P.: Overlapping of communication and computation and early binding: fundamental mechanisms for improving parallel performance on clusters of workstations. Ph.D. thesis, Mississippi State, MS, USA (2001)

  12. Graham, R.L., Shipman, G.M., Barrett, B.W., Castain, R.H., Bosilca, G., Lumsdaine, A.: Open MPI: a high-performance, heterogeneous MPI. In: Cluster 2006, pp. 1–9, September 2006

  13. Grant, R.E., Dosanjh, M.G.F., Levenhagen, M.J., Brightwell, R., Skjellum, A.: Finepoints: partitioned multithreaded MPI communication. In: Weiland, M., Juckeland, G., Trinitis, C., Sadayappan, P. (eds.) ISC High Performance 2019. LNCS, vol. 11501, pp. 330–350. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-20656-7_17

  14. Gropp, W.: MPICH2: a new start for MPI implementations. In: Kranzlmüller, D., Volkert, J., Kacsuk, P., Dongarra, J. (eds.) EuroPVM/MPI 2002. LNCS, vol. 2474, pp. 7–7. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-45825-5_5

  15. Guo, J., Yi, Q., Meng, J., Zhang, J., Balaji, P.: Compiler-assisted overlapping of communication and computation in MPI applications. In: 2016 IEEE International Conference on Cluster Computing (CLUSTER), pp. 60–69, September 2016. https://doi.org/10.1109/CLUSTER.2016.62

  16. Hager, G., Schubert, G., Wellein, G.: Prospects for truly asynchronous communication with pure MPI and hybrid MPI/OpenMP on current supercomputing platforms (2011)

  17. Hassani, A.: Toward a scalable, transactional, fault-tolerant message passing interface for petascale and exascale machines. Ph.D. thesis, UAB (2016)

  18. Hoefler, T., Lumsdaine, A.: Message progression in parallel computing - to thread or not to thread? In: 2008 IEEE International Conference on Cluster Computing, pp. 213–222, September 2008. https://doi.org/10.1109/CLUSTR.2008.4663774

  19. Holmes, D., et al.: MPI sessions: leveraging runtime infrastructure to increase scalability of applications at exascale. In: EuroMPI 2016, pp. 121–129. ACM, New York (2016)

  20. Holmes, D.J., Morgan, B., Skjellum, A., Bangalore, P.V., Sridharan, S.: Planning for performance: enhancing achievable performance for MPI through persistent collective operations. Parallel Comput. 81, 32–57 (2019)

  21. ISO: ISO/IEC 14882:2017 Information technology – Programming languages – C++. Fifth edn., December 2017. https://tinyurl.com/yct5hxcs

  22. Liu, J., et al.: Performance comparison of MPI implementations over Infiniband, Myrinet and Quadrics. In: Proceedings of the 2003 ACM/IEEE Conference on Supercomputing, SC 2003, pp. 58–58, November 2003. https://doi.org/10.1109/SC.2003.10007

  23. Laguna, I., Mohror, K., Sultana, N., Rüfenacht, M., Marshall, R., Skjellum, A.: A large-scale study of MPI usage in open-source HPC applications. In: Proceedings of SC 2019, November 2019 (in press). https://github.com/LLNL/MPI-Usage

  24. Lu, H., Seo, S., Balaji, P.: MPI+ULT: overlapping communication and computation with user-level threads. In: 2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security, and 2015 IEEE 12th International Conference on Embedded Software and Systems, pp. 444–454, August 2015. https://doi.org/10.1109/HPCC-CSS-ICESS.2015.82

  25. Panda, D.K., Tomko, K., Schulz, K., Majumdar, A.: The MVAPICH project: evolution and sustainability of an open source production quality MPI library for HPC. In: WSPPE (2013)

  26. Skjellum, A., et al.: Object-oriented analysis and design of the message passing interface. Concurrency Comput.: Practice Exp. 13(4), 245–292 (2001). https://doi.org/10.1002/cpe.556

  27. Sridharan, S., Dinan, J., Kalamkar, D.D.: Enabling efficient multithreaded MPI communication through a library-based implementation of MPI endpoints. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2014, pp. 487–498. IEEE Press, Piscataway (2014). https://doi.org/10.1109/SC.2014.45

  28. Sultana, N., Rüfenacht, M., Skjellum, A., Laguna, I., Mohror, K.: Failure recovery for bulk synchronous applications with MPI stages. Parallel Comput. 84, 1–14 (2019)

  29. Wittmann, M., Hager, G., Zeiser, T., Wellein, G.: Asynchronous MPI for the masses. arXiv preprint arXiv:1302.4280 (2013)

Author information

Corresponding author

Correspondence to Anthony Skjellum.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Skjellum, A., Rüfenacht, M., Sultana, N., Schafer, D., Laguna, I., Mohror, K. (2020). ExaMPI: A Modern Design and Implementation to Accelerate Message Passing Interface Innovation. In: Crespo-Mariño, J., Meneses-Rojas, E. (eds) High Performance Computing. CARLA 2019. Communications in Computer and Information Science, vol 1087. Springer, Cham. https://doi.org/10.1007/978-3-030-41005-6_11

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-41005-6_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-41004-9

  • Online ISBN: 978-3-030-41005-6

  • eBook Packages: Computer Science, Computer Science (R0)
