ExaMPI: A Modern Design and Implementation to Accelerate Message Passing Interface Innovation

  • Conference paper
High Performance Computing (CARLA 2019)

Abstract

The difficulty of deep experimentation with Message Passing Interface (MPI) implementations, which are large and complex, substantially raises the cost and complexity of proof-of-concept activities and limits the community of potential contributors to new and better MPI features and implementations alike. Our goal is to enable researchers to experiment rapidly and easily with new concepts, algorithms, and internal protocols for MPI. To that end, we introduce ExaMPI, a modern MPI-3.x subset with a robust MPI-4.x roadmap. We discuss its design, early implementation, and ongoing use in parallel programming research, as well as specific research activities that ExaMPI enables.

Architecturally, ExaMPI is a C++17 library designed for modularity, extensibility, and understandability. The code base uses native C++ threading with thread-safe data structures and a modular progress engine. In addition, a transport abstraction supports UDP, TCP, OFED verbs, and libfabric for high-performance networks (a schematic sketch of such an interface follows the abstract).

By enabling researchers with ExaMPI, we seek to accelerate innovations and increase the number of new experiments and experimenters, all while expanding MPI’s applicability.
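
To make the transport abstraction mentioned above concrete, the following C++17 sketch is purely illustrative: the names Transport, LoopbackTransport, and make_transport are hypothetical and are not taken from ExaMPI's code base. It shows the general idea of placing concrete backends (UDP, TCP, OFED verbs, libfabric) behind one interface so that protocols and the progress engine stay backend-agnostic.

    // Hypothetical sketch (names are not from ExaMPI's code base): a pluggable
    // transport abstraction in C++17. Protocols and the progress engine talk
    // only to the Transport interface; backends (UDP, TCP, verbs, libfabric)
    // plug in behind it.
    #include <algorithm>
    #include <cstddef>
    #include <iostream>
    #include <memory>
    #include <string>
    #include <vector>

    class Transport {
    public:
        virtual ~Transport() = default;
        virtual void send(int dest_rank, const void* buf, std::size_t bytes) = 0;
        virtual std::size_t recv(int src_rank, void* buf, std::size_t bytes) = 0;
        virtual std::string name() const = 0;
    };

    // A trivial in-process backend; a real one would wrap sockets, OFED verbs,
    // or libfabric endpoints.
    class LoopbackTransport : public Transport {
    public:
        void send(int, const void* buf, std::size_t bytes) override {
            const auto* p = static_cast<const std::byte*>(buf);
            staged_.assign(p, p + bytes);  // stage the message locally
        }
        std::size_t recv(int, void* buf, std::size_t bytes) override {
            const std::size_t n = std::min(bytes, staged_.size());
            std::copy_n(staged_.begin(), n, static_cast<std::byte*>(buf));
            return n;
        }
        std::string name() const override { return "loopback"; }
    private:
        std::vector<std::byte> staged_;
    };

    // Factory: switching backends becomes a configuration choice rather than
    // a change to protocol or progress-engine code.
    std::unique_ptr<Transport> make_transport(const std::string& which) {
        if (which == "loopback") return std::make_unique<LoopbackTransport>();
        return nullptr;  // "udp", "tcp", "verbs", "libfabric" would go here
    }

    int main() {
        auto t = make_transport("loopback");
        int out = 42, in = 0;
        t->send(0, &out, sizeof out);
        t->recv(0, &in, sizeof in);
        std::cout << t->name() << " delivered " << in << "\n";
    }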

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344 with Lawrence Livermore National Security, LLC (LLNL-CONF-775497), and with partial support from the National Science Foundation under Grant Nos. CCF-1562659, CCF-1562306, CCF-1617690, CCF-1822191, and CCF-1821431. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or Lawrence Livermore National Laboratory.


Notes

  1. In fact, a raft of papers in the literature (e.g., [7–10, 15, 16, 18, 24, 27, 29]) show workarounds to polling progress that involve sporadically and haphazardly strewing one’s code with MPI_Test. Also, the OSU benchmark for overlap explicitly depends on the use of MPI_Test [4, 22]. (A minimal sketch of this polling pattern appears after these notes.)

  2. to avoid code cloning, enable use of compiler-supported threads, employ metaprogramming and polymorphism where appropriate, and enable enhanced modularity over C.

  3. independent progress of messages through the network, independently of how often an application calls MPI functions; see also Sect. 5.2.

  4. based on the long experience of the first author and a review of many applications’ use of MPI, as supported by a recent study by some of us and others [23].
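
As a companion to note 1, the sketch below is illustrative only: it is not code from the paper, and do_some_compute_chunk is a hypothetical placeholder. It shows the common workaround of interleaving MPI_Test calls with computation so that a nonblocking transfer makes progress; with the strong, application-independent progress described in note 3, the polling calls inside the loop would not be needed to achieve overlap.

    // Illustrative only (not code from the paper): the MPI_Test "manual
    // progress" workaround referenced in note 1. The application pokes the
    // MPI library between compute chunks so that a nonblocking transfer
    // keeps advancing.
    #include <mpi.h>
    #include <vector>

    // Hypothetical stand-in for a unit of useful computation.
    void do_some_compute_chunk(std::vector<double>& v) {
        for (double& x : v) x = x * 1.000001 + 1.0;
    }

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank = 0, size = 1;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        std::vector<double> payload(1 << 20, 1.0), work(1 << 16, 1.0);
        MPI_Request req = MPI_REQUEST_NULL;
        if (rank == 0 && size > 1)
            MPI_Isend(payload.data(), static_cast<int>(payload.size()),
                      MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &req);
        else if (rank == 1)
            MPI_Irecv(payload.data(), static_cast<int>(payload.size()),
                      MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &req);

        // The workaround: sprinkle MPI_Test between compute chunks to force
        // the library to make progress on the outstanding request.
        int done = 0;
        while (!done) {
            do_some_compute_chunk(work);
            MPI_Test(&req, &done, MPI_STATUS_IGNORE);
        }

        MPI_Finalize();
        return 0;
    }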

References

  1. Cray MPI. https://pubs.cray.com/content/S-2529/17.05/xctm-series-programming-environment-user-guide-1705-s-2529/mpt

  2. IBM Spectrum MPI. https://tinyurl.com/yy9cwm4p

  3. MPI/Pro. https://www.runtimecomputing.com/products/mpipro/

  4. OSU micro-benchmarks 5.6.2. http://mvapich.cse.ohio-state.edu/benchmarks/

  5. Intel MPI library, August 2018. https://software.intel.com/en-us/mpi-library

  6. Bangalore, P., Rabenseifner, R., Holmes, D., Jaeger, J., Mercier, G., Blaas-Schenner, C., Skjellum, A.: Exposition, clarification, and expansion of MPI semantic terms and conventions (2019). Under review

  7. Barigou, Y., Venkatesan, V., Gabriel, E.: Auto-tuning non-blocking collective communication operations. In: 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, pp. 1204–1213, May 2015. https://doi.org/10.1109/IPDPSW.2015.15

  8. Castillo, E., et al.: Optimizing computation-communication overlap in asynchronous task-based programs: poster. In: Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming, PPoPP 2019, pp. 415–416. ACM, New York (2019). https://doi.org/10.1145/3293883.3295720

  9. Denis, A., Trahay, F.: MPI overlap: benchmark and analysis. In: 2016 45th International Conference on Parallel Processing (ICPP), pp. 258–267, August 2016. https://doi.org/10.1109/ICPP.2016.37

  10. Didelot, S., Carribault, P., Pérache, M., Jalby, W.: Improving MPI communication overlap with collaborative polling. In: Träff, J.L., Benkner, S., Dongarra, J.J. (eds.) EuroMPI 2012. LNCS, vol. 7490, pp. 37–46. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33518-1_9

  11. Dimitrov, R.P.: Overlapping of communication and computation and early binding: fundamental mechanisms for improving parallel performance on clusters of workstations. Ph.D. thesis, Mississippi State, MS, USA (2001)

  12. Graham, R.L., Shipman, G.M., Barrett, B.W., Castain, R.H., Bosilca, G., Lumsdaine, A.: Open MPI: a high-performance, heterogeneous MPI. In: Cluster 2006, pp. 1–9, September 2006

  13. Grant, R.E., Dosanjh, M.G.F., Levenhagen, M.J., Brightwell, R., Skjellum, A.: Finepoints: partitioned multithreaded MPI communication. In: Weiland, M., Juckeland, G., Trinitis, C., Sadayappan, P. (eds.) ISC High Performance 2019. LNCS, vol. 11501, pp. 330–350. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-20656-7_17

  14. Gropp, W.: MPICH2: a new start for MPI implementations. In: Kranzlmüller, D., Volkert, J., Kacsuk, P., Dongarra, J. (eds.) EuroPVM/MPI 2002. LNCS, vol. 2474, pp. 7–7. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-45825-5_5

  15. Guo, J., Yi, Q., Meng, J., Zhang, J., Balaji, P.: Compiler-assisted overlapping of communication and computation in MPI applications. In: 2016 IEEE International Conference on Cluster Computing (CLUSTER), pp. 60–69, September 2016. https://doi.org/10.1109/CLUSTER.2016.62

  16. Hager, G., Schubert, G., Wellein, G.: Prospects for truly asynchronous communication with pure MPI and hybrid MPI/OpenMP on current supercomputing platforms (2011)

  17. Hassani, A.: Toward a scalable, transactional, fault-tolerant message passing interface for petascale and exascale machines. Ph.D. thesis, UAB (2016)

  18. Hoefler, T., Lumsdaine, A.: Message progression in parallel computing - to thread or not to thread? In: 2008 IEEE International Conference on Cluster Computing, pp. 213–222, September 2008. https://doi.org/10.1109/CLUSTR.2008.4663774

  19. Holmes, D., et al.: MPI sessions: leveraging runtime infrastructure to increase scalability of applications at exascale. In: EuroMPI 2016, pp. 121–129. ACM, New York (2016)

  20. Holmes, D.J., Morgan, B., Skjellum, A., Bangalore, P.V., Sridharan, S.: Planning for performance: enhancing achievable performance for MPI through persistent collective operations. Parallel Comput. 81, 32–57 (2019)

  21. ISO: ISO/IEC 14882:2017 Information technology – Programming languages – C++. Fifth edn., December 2017. https://tinyurl.com/yct5hxcs

  22. Liu, J., et al.: Performance comparison of MPI implementations over Infiniband, Myrinet and Quadrics. In: Proceedings of the 2003 ACM/IEEE Conference on Supercomputing, SC 2003, pp. 58–58, November 2003. https://doi.org/10.1109/SC.2003.10007

  23. Laguna, I., Mohror, K., Sultana, N., Rüfenacht, M., Marshall, R., Skjellum, A.: A large-scale study of MPI usage in open-source HPC applications. In: Proceedings of SC 2019, November 2019 (in press). https://github.com/LLNL/MPI-Usage

  24. Lu, H., Seo, S., Balaji, P.: MPI+ULT: overlapping communication and computation with user-level threads. In: 2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security, and 2015 IEEE 12th International Conference on Embedded Software and Systems, pp. 444–454, August 2015. https://doi.org/10.1109/HPCC-CSS-ICESS.2015.82

  25. Panda, D.K., Tomko, K., Schulz, K., Majumdar, A.: The MVAPICH project: evolution and sustainability of an open source production quality MPI library for HPC. In: WSPPE (2013)

  26. Skjellum, A., et al.: Object-oriented analysis and design of the message passing interface. Concurrency Comput.: Practice Exp. 13(4), 245–292 (2001). https://doi.org/10.1002/cpe.556

  27. Sridharan, S., Dinan, J., Kalamkar, D.D.: Enabling efficient multithreaded MPI communication through a library-based implementation of MPI endpoints. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2014, pp. 487–498. IEEE Press, Piscataway (2014). https://doi.org/10.1109/SC.2014.45

  28. Sultana, N., Rüfenacht, M., Skjellum, A., Laguna, I., Mohror, K.: Failure recovery for bulk synchronous applications with MPI stages. Parallel Comput. 84, 1–14 (2019)

  29. Wittmann, M., Hager, G., Zeiser, T., Wellein, G.: Asynchronous MPI for the masses. arXiv preprint arXiv:1302.4280 (2013)

Author information

Corresponding author

Correspondence to Anthony Skjellum.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Skjellum, A., Rüfenacht, M., Sultana, N., Schafer, D., Laguna, I., Mohror, K. (2020). ExaMPI: A Modern Design and Implementation to Accelerate Message Passing Interface Innovation. In: Crespo-Mariño, J., Meneses-Rojas, E. (eds) High Performance Computing. CARLA 2019. Communications in Computer and Information Science, vol 1087. Springer, Cham. https://doi.org/10.1007/978-3-030-41005-6_11

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-41005-6_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-41004-9

  • Online ISBN: 978-3-030-41005-6

  • eBook Packages: Computer Science, Computer Science (R0)
