Abstract
We show how to adapt and extend a well-known allgather (all-to-all broadcast) algorithm to parallel systems with a hierarchical communication system, such as clusters of SMP nodes. For small problem sizes, the new algorithm requires a number of communication rounds that is logarithmic in the number of SMP nodes, and it degrades gracefully towards a linear algorithm as the problem size increases. The algorithm has been used to implement the MPI_Allgather collective operation in the MPI/SX library. Performance measurements on a 72-node SX-8 system show that graceful degradation provides a smooth transition from logarithmic to linear behavior, and that the new algorithm significantly outperforms a standard linear algorithm. The performance of the latter is, furthermore, highly sensitive to the distribution of MPI processes over the physical processors.
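To make the contrast between logarithmic and linear round counts concrete, the following is a minimal simulation sketch. It is not the algorithm of the paper. The abstract does not name the well-known algorithm it builds on, so the sketch assumes a dissemination pattern in the style of Bruck et al. for the logarithmic case and a simple ring for the linear case. It models neither the SMP hierarchy nor the graceful degradation, and the function names are hypothetical.

```python
def dissemination_allgather(p):
    """Simulate a dissemination-style allgather on p ranks.

    Assumption: this is a Bruck-style pattern chosen for illustration,
    not the hierarchical algorithm described in the abstract.
    Returns (per-rank buffers in rank order, number of rounds).
    """
    # Each rank starts with only its own block, identified by its rank.
    have = [[r] for r in range(p)]
    rounds = 0
    dist = 1
    while dist < p:
        # Rank r receives from rank (r + dist) % p the sender's first
        # min(dist, p - dist) blocks, so buffers roughly double per round.
        count = min(dist, p - dist)
        have = [have[r] + have[(r + dist) % p][:count] for r in range(p)]
        dist *= 2
        rounds += 1
    # Rank r now holds blocks r, r+1, ..., r+p-1 (mod p); rotate to rank order.
    ordered = [blocks[p - r:] + blocks[:p - r] for r, blocks in enumerate(have)]
    return ordered, rounds


def ring_allgather(p):
    """Simulate a linear ring allgather on p ranks (p - 1 rounds)."""
    have = [{r} for r in range(p)]
    newest = list(range(p))  # block each rank received most recently
    rounds = 0
    for _ in range(p - 1):
        # Every rank passes its most recently received block to the right.
        newest = [newest[(r - 1) % p] for r in range(p)]
        for r in range(p):
            have[r].add(newest[r])
        rounds += 1
    return [sorted(s) for s in have], rounds


if __name__ == "__main__":
    nodes = 72  # node count taken from the abstract
    print(dissemination_allgather(nodes)[1], "rounds (dissemination)")
    print(ring_allgather(nodes)[1], "rounds (ring)")
```

For 72 participants the dissemination pattern needs 7 rounds and the ring needs 71. The later rounds of the dissemination pattern carry much larger messages, which is why a linear scheme can become competitive as the problem size grows.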
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Träff, J.L. (2006). Efficient Allgather for Regular SMP-Clusters. In: Mohr, B., Träff, J.L., Worringen, J., Dongarra, J. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2006. Lecture Notes in Computer Science, vol 4192. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11846802_16
DOI: https://doi.org/10.1007/11846802_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-39110-4
Online ISBN: 978-3-540-39112-8
eBook Packages: Computer Science, Computer Science (R0)