Abstract
Nearly all implementations of the Message Passing Interface (MPI) employ a two-level protocol for point-to-point messages. Short messages are sent eagerly to optimize for latency, and long messages are typically sent using a rendezvous mechanism. In a rendezvous implementation, the sender must first send a request and receive an acknowledgment before the data can be transferred. While there are several possible reasons for using this strategy for long messages, most implementations are forced to use a rendezvous strategy by operating system and/or network limitations. In this paper, we compare an implementation that uses a rendezvous protocol for long messages with an implementation that adds an eager optimization for long messages. We discuss implementation issues and provide a performance comparison for several micro-benchmarks. We also present a new micro-benchmark that may provide better insight into how these different protocols affect application performance. Results for this new benchmark indicate that, for larger messages, a significant number of receives must be pre-posted in order for an eager protocol optimization to outperform a rendezvous protocol.
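The two-level protocol described above can be illustrated with a small toy model in Python. This is only a sketch and not the implementation evaluated in the paper: the names `EAGER_LIMIT`, `transfer_steps`, and `wire_crossings` are hypothetical, the 16 KB threshold is an arbitrary choice, and the handling of a long eager message that arrives before its receive is posted (the data is assumed to be discarded and re-sent through the handshake) is an assumption made purely for illustration.

```python
# Toy model of eager vs. rendezvous point-to-point transfers.
# All names and the threshold are hypothetical, chosen for illustration.

EAGER_LIMIT = 16 * 1024  # assumed short/long message threshold, in bytes


def transfer_steps(size, preposted, long_eager=False, eager_limit=EAGER_LIMIT):
    """Return the ordered steps for one send in this toy model.

    size        -- message size in bytes
    preposted   -- True if the matching receive was posted before arrival
    long_eager  -- True to enable the eager optimization for long messages
    """
    if size <= eager_limit:
        # Short message: data is sent immediately. If no receive is posted,
        # the receiver is assumed to buffer it and copy it out later.
        return ["data"] if preposted else ["data", "copy from unexpected buffer"]
    if not long_eager:
        # Long message, plain rendezvous: request, acknowledgment, then data.
        return ["request", "acknowledgment", "data"]
    if preposted:
        # Long message, eager optimization, receive already posted:
        # the data lands directly and no handshake is needed.
        return ["data"]
    # ASSUMPTION: an unexpected long eager message is discarded and the
    # transfer falls back to the rendezvous handshake.
    return ["data (discarded)", "request", "acknowledgment", "data"]


def wire_crossings(steps):
    """Count the steps that cross the network (local copies are excluded)."""
    return sum(1 for step in steps if not step.startswith("copy"))


if __name__ == "__main__":
    one_mb = 1 << 20
    for preposted in (True, False):
        for long_eager in (False, True):
            steps = transfer_steps(one_mb, preposted, long_eager)
            print(preposted, long_eager, wire_crossings(steps), steps)
```

Under these assumptions, the eager optimization saves the handshake only when the receive is pre-posted, and costs an extra data transfer when it is not, which is one way to see why the fraction of pre-posted receives matters for larger messages.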
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Brightwell, R., Underwood, K. (2003). Evaluation of an Eager Protocol Optimization for MPI. In: Dongarra, J., Laforenza, D., Orlando, S. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2003. Lecture Notes in Computer Science, vol 2840. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39924-7_46
DOI: https://doi.org/10.1007/978-3-540-39924-7_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20149-6
Online ISBN: 978-3-540-39924-7
eBook Packages: Springer Book Archive