Exploiting application buffer reuse to improve MPI small message transfer protocols over RDMA-enabled networks

Abstract

To avoid the memory registration cost for small messages, MPI implementations over RDMA-enabled networks use transfer protocols that copy each message into pre-registered intermediate buffers at both the sender and the receiver. In this paper, we propose eliminating the send-side copy when an application buffer is reused frequently: we show that it is more efficient to register the application buffer itself and use it directly for data transfer. The idea is examined for the small message transfer protocols in MVAPICH2, including RDMA Write and Send/Receive based communication, one-sided communication, and collectives. The proposed protocol adaptively falls back to the current protocol when the application does not reuse its buffers frequently. Performance results over InfiniBand indicate up to 14% improvement in single-message latency, close to 20% improvement for one-sided operations, and up to 25% improvement for collectives. In addition, the communication time of MPI applications with high buffer reuse improves with this technique.
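
As a rough illustration of the send-side decision described above, the sketch below models in plain C how an MPI library might count reuses of an application buffer and switch from the copy-based eager path to a registered, copy-free path once a threshold is crossed. It is a minimal sketch, not MVAPICH2 code: the cache layout, the REUSE_THRESHOLD value, and the register_buffer/send_from helpers are illustrative placeholders (actual registration on InfiniBand would go through the verbs API, e.g. ibv_reg_mr, and a real implementation would also deregister or evict stale entries).

/*
 * Sketch of a reuse-driven send-side eager path (illustrative only).
 * Small messages normally go through a copy into a pre-registered
 * intermediate buffer; once an application buffer has been reused
 * often enough, it is registered once and sent from directly.
 */
#include <stdio.h>
#include <string.h>
#include <stdint.h>

#define REUSE_THRESHOLD 4      /* assumed switch-over point                */
#define CACHE_SLOTS     64     /* tiny direct-mapped reuse cache           */
#define EAGER_LIMIT     8192   /* "small message" cutoff handled here      */

typedef struct {
    void  *addr;               /* application buffer address               */
    size_t len;
    int    reuse_count;        /* sends observed from this buffer          */
    int    registered;         /* set once the buffer has been pinned      */
} reuse_entry;

static reuse_entry cache[CACHE_SLOTS];
static char eager_buf[EAGER_LIMIT];   /* stands in for the pre-registered
                                         intermediate (bounce) buffer      */

static void register_buffer(void *addr, size_t len)
{
    (void)addr;  /* a real implementation would call e.g. ibv_reg_mr() here */
    printf("registering %zu bytes once; later sends skip the copy\n", len);
}

static void send_from(const void *src, size_t len)
{
    (void)src;   /* a real implementation would post an RDMA write from src */
    printf("posting RDMA write of %zu bytes\n", len);
}

/* Decide, per send, between the copy-based path and the zero-copy path.
 * Assumes len <= EAGER_LIMIT (larger messages use a different protocol). */
static void eager_send(const void *app_buf, size_t len)
{
    /* Direct-mapped lookup; collisions simply reset the entry. */
    reuse_entry *e = &cache[((uintptr_t)app_buf >> 6) % CACHE_SLOTS];

    if (e->addr == app_buf && e->len == len) {
        e->reuse_count++;
    } else {                       /* new or evicted buffer: start counting */
        e->addr = (void *)app_buf;
        e->len = len;
        e->reuse_count = 1;
        e->registered = 0;
    }

    if (e->reuse_count >= REUSE_THRESHOLD) {
        if (!e->registered) {
            register_buffer(e->addr, e->len);  /* one-time pinning cost     */
            e->registered = 1;
        }
        send_from(app_buf, len);               /* copy-free path            */
    } else {
        memcpy(eager_buf, app_buf, len);       /* fall back: classic eager copy */
        send_from(eager_buf, len);
    }
}

int main(void)
{
    char msg[256] = "hello";
    for (int i = 0; i < 6; i++)    /* repeated sends from the same buffer   */
        eager_send(msg, sizeof msg);
    return 0;
}

The point of the threshold is that the one-time registration cost is only paid when it can be amortized over repeated sends from the same buffer; applications with little buffer reuse keep the existing copy-based behavior, which is the adaptive fallback the abstract refers to.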



Author information

Corresponding author

Correspondence to Ahmad Afsahi.


Cite this article

Rashti, M.J., Afsahi, A. Exploiting application buffer reuse to improve MPI small message transfer protocols over RDMA-enabled networks. Cluster Comput 14, 345–356 (2011). https://doi.org/10.1007/s10586-011-0165-8

