
To Share or Not to Share: A Case for MPI in Shared-Memory

  • Conference paper
  • In: Recent Advances in the Message Passing Interface (EuroMPI 2024)
  • Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15267)

Abstract

The evolution of parallel computing architectures presents new challenges for developing efficient parallelized codes. The emergence of heterogeneous systems has given rise to multiple programming models, each requiring careful adaptation to maximize performance. In this context, we propose reevaluating memory layout designs for computational tasks within larger nodes by comparing various architectures. To gain insight into the performance discrepancies between shared memory and shared-address space settings, we systematically measure the bandwidth between cores and sockets using different methodologies. Our findings reveal significant differences in performance, suggesting that MPI running inside UNIX processes may not fully utilize its intranode bandwidth potential. In light of our work in the MPC thread-based MPI runtime, which can leverage shared memory to achieve higher performance due to its optimized layout, we advocate for enabling the use of shared memory within the MPI standard.
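For context on what "shared memory within the MPI standard" could build on, the sketch below uses the MPI-3 shared-memory window interface (MPI_Comm_split_type with MPI_COMM_TYPE_SHARED plus MPI_Win_allocate_shared), which already lets ranks on the same node exchange data through direct load/store. This is a minimal illustrative sketch assuming any MPI-3 implementation, not the paper's MPC-based mechanism.

    /* Minimal sketch: node-local shared memory through MPI-3 windows.
     * Illustrative background only; not the paper's thread-based approach. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        /* Restrict to ranks that can actually share memory (same node). */
        MPI_Comm node_comm;
        MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                            MPI_INFO_NULL, &node_comm);

        int node_rank;
        MPI_Comm_rank(node_comm, &node_rank);

        /* Each rank contributes one int to a node-local shared window. */
        int *base;
        MPI_Win win;
        MPI_Win_allocate_shared(sizeof(int), sizeof(int), MPI_INFO_NULL,
                                node_comm, &base, &win);

        /* Query rank 0's segment and access it with plain loads/stores. */
        MPI_Aint size;
        int disp_unit;
        int *rank0_base;
        MPI_Win_shared_query(win, 0, &size, &disp_unit, &rank0_base);

        MPI_Win_lock_all(0, win);
        base[0] = node_rank;          /* write my own segment */
        MPI_Win_sync(win);
        MPI_Barrier(node_comm);
        printf("rank %d sees rank 0's value: %d\n", node_rank, rank0_base[0]);
        MPI_Win_unlock_all(win);

        MPI_Win_free(&win);
        MPI_Comm_free(&node_comm);
        MPI_Finalize();
        return 0;
    }

Note that this interface only covers buffers allocated through the window call itself, which is presumably part of why a thread-based runtime such as MPC, where all buffers already live in one address space, can go further.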

Notes

  1. https://github.com/besnardjb/memmapper
  2. https://github.com/besnardjb/memmapper
  3. spack install openmpi fabrics=cma (see the sketch after these notes)
  4.
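Note 3 above builds Open MPI with CMA (Linux cross-memory attach). As a hedged illustration of what that kernel path does, the sketch below copies bytes directly out of another process's address space with a single process_vm_readv call, avoiding a bounce through an intermediate shared-memory segment. cma_read is a hypothetical helper name, and for the sake of a runnable example the "remote" process is the caller itself; in an MPI library the pid and remote address would be exchanged between ranks out of band.

    /* Illustrative sketch of Linux cross-memory attach (CMA), not code
     * from the paper: one kernel-mediated copy between address spaces. */
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/uio.h>
    #include <unistd.h>

    /* Copy `len` bytes from `remote_addr` in process `pid` into `local_buf`. */
    static ssize_t cma_read(pid_t pid, void *remote_addr,
                            void *local_buf, size_t len)
    {
        struct iovec local  = { .iov_base = local_buf,   .iov_len = len };
        struct iovec remote = { .iov_base = remote_addr, .iov_len = len };
        return process_vm_readv(pid, &local, 1, &remote, 1, 0);
    }

    int main(void)
    {
        /* Demonstration only: read from our own address space. */
        char src[] = "hello from the 'remote' buffer";
        char dst[sizeof(src)] = {0};

        if (cma_read(getpid(), src, dst, sizeof(src)) < 0) {
            perror("process_vm_readv");
            return 1;
        }
        printf("copied: %s\n", dst);
        return 0;
    }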


Author information

Corresponding author

Correspondence to Julien Adam.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Adam, J., Besnard, JB., Roussel, A., Jaeger, J., Carribault, P., Pérache, M. (2025). To Share or Not to Share: A Case for MPI in Shared-Memory. In: Blaas-Schenner, C., Niethammer, C., Haas, T. (eds) Recent Advances in the Message Passing Interface. EuroMPI 2024. Lecture Notes in Computer Science, vol 15267. Springer, Cham. https://doi.org/10.1007/978-3-031-73370-3_6

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-73370-3_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-73369-7

  • Online ISBN: 978-3-031-73370-3

  • eBook Packages: Computer Science, Computer Science (R0)
