Abstract
As parallel applications grow in complexity, their demands for computational power continue to rise. Recent trends in High-Performance Computing (HPC) show that improvements in single-core performance will not be sufficient to meet the challenges of an Exascale machine: we expect an enormous growth in the number of cores as well as a multiplication of the data volume exchanged across compute nodes. To scale applications up to Exascale, the communication layer has to minimize the time spent waiting for network messages. This paper presents a message-progression mechanism based on Collaborative Polling, which enables efficient, auto-adaptive overlap of communication phases with computation. The approach is novel in that it increases an application's overlap potential without introducing the overheads of threaded message progression.
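As a point of reference for the progression problem the abstract describes, the sketch below shows the classic application-level attempt at communication/computation overlap with plain MPI. It is not the paper's Collaborative Polling mechanism, which is implemented inside the MPI runtime; the buffer sizes, chunking, and the peer pairing (rank XOR 1, assuming an even number of ranks) are illustrative assumptions. The point is that overlap only materializes if the library gets a chance to progress the message, here via periodic MPI_Test calls; otherwise the transfer may only complete inside the final MPI_Wait.

```c
/* Minimal sketch of application-level overlap with manual progression.
 * Not the paper's Collaborative Polling implementation. */
#include <mpi.h>
#include <stdlib.h>

#define N     1000000   /* message size in doubles (illustrative) */
#define CHUNK 10000     /* computation granularity between progress calls */

static void compute_chunk(double *data, int offset, int len) {
    for (int i = offset; i < offset + len; i++)
        data[i] = data[i] * 1.000001 + 0.5;   /* placeholder work */
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double *work = malloc(N * sizeof(double));
    double *recv = malloc(N * sizeof(double));
    for (int i = 0; i < N; i++) work[i] = (double)i;

    int peer = rank ^ 1;  /* illustrative pairing; assumes an even rank count */
    MPI_Request req;

    /* Post the receive first so the large send can match a pre-posted buffer. */
    MPI_Irecv(recv, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &req);
    MPI_Send(work, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD);

    /* Interleave computation with explicit progression: without these
     * MPI_Test calls (or an asynchronous progress engine), the rendezvous
     * may only advance inside MPI_Wait, and the intended overlap is lost. */
    int done = 0;
    for (int off = 0; off < N; off += CHUNK) {
        compute_chunk(work, off, CHUNK);
        if (!done) MPI_Test(&req, &done, MPI_STATUS_IGNORE);
    }
    MPI_Wait(&req, MPI_STATUS_IGNORE);

    free(work);
    free(recv);
    MPI_Finalize();
    return 0;
}
```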
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Didelot, S., Carribault, P., Pérache, M., Jalby, W. (2012). Improving MPI Communication Overlap with Collaborative Polling. In: Träff, J.L., Benkner, S., Dongarra, J.J. (eds) Recent Advances in the Message Passing Interface. EuroMPI 2012. Lecture Notes in Computer Science, vol 7490. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33518-1_9
DOI: https://doi.org/10.1007/978-3-642-33518-1_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33517-4
Online ISBN: 978-3-642-33518-1