Abstract
In this paper, we propose a novel allgather algorithm, Reindexed Recursive K-ing (RRK), which leverages the flexibility of its tree topology and its ability to make asynchronous progress, coupled with the Core-Direct communication offload capability, to optimize MPI_Allgather for Core-Direct-enabled systems. In particular, RRK introduces a reindexing scheme that ensures contiguous data transfers while adding only a single additional send and receive operation for any radix k and communicator size N. This allows us to improve algorithm scalability by avoiding the use of a scatter/gather element (SGE) list on InfiniBand networks. The implementation of the RRK algorithm and its evaluation show that it performs and scales well on Core-Direct systems for a wide range of message sizes and various communicator configurations.
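To make the exchange pattern behind a radix-k allgather concrete, the following is a minimal simulation sketch of a generic recursive k-ing exchange, restricted to the case where N is an exact power of k. It is not the RRK algorithm itself: the abstract does not specify the reindexing scheme, so the sketch does not attempt it, and the function name `recursive_king_allgather` and its return format are assumptions made for illustration. It only shows that, in the power-of-k case, every rank holds one contiguous range of blocks after each step, which is the property the reindexing scheme is said to preserve for arbitrary k and N.

```python
def recursive_king_allgather(n_ranks, k):
    """Simulate a radix-k recursive exchange allgather (hypothetical helper).

    Assumes n_ranks is an exact power of k; the general case handled by the
    paper's reindexing scheme is NOT modeled here.
    Returns (held, log): held[r] is the (start, length) block range owned by
    rank r at the end, and log lists (step, rank, peers) tuples.
    """
    if k < 2 or n_ranks < 1:
        raise ValueError("need k >= 2 and n_ranks >= 1")
    steps, size = 0, 1
    while size < n_ranks:
        size *= k
        steps += 1
    if size != n_ranks:
        raise ValueError("this sketch only handles n_ranks == k**m")

    # Each rank starts with exactly its own block.
    held = [(r, 1) for r in range(n_ranks)]
    log = []
    dist = 1  # k**step: distance between exchange partners at this step
    for step in range(steps):
        new_held = []
        for r in range(n_ranks):
            digit = (r // dist) % k  # base-k digit of r at this step
            # Partners differ from r only in that digit.
            peers = [r + (d - digit) * dist for d in range(k) if d != digit]
            ranges = sorted([held[r]] + [held[p] for p in peers])
            # Check that the received ranges tile one contiguous range.
            start = ranges[0][0]
            end = start
            for s, length in ranges:
                assert s == end, "non-contiguous data"
                end += length
            new_held.append((start, end - start))
            log.append((step, r, peers))
        held = new_held
        dist *= k
    return held, log
```

For example, `recursive_king_allgather(9, 3)` completes in two steps, with rank 4 exchanging with ranks 3 and 5 in the first step and ranks 1 and 7 in the second. When N is not a power of k, this simple scheme no longer applies directly, which is the situation the reindexing described in the abstract addresses.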
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ladd, J.S., Venkata, M.G., Graham, R., Shamis, P. (2012). Assessing the Performance and Scalability of a Novel Multilevel K-Nomial Allgather on CORE-Direct Systems. In: Kaklamanis, C., Papatheodorou, T., Spirakis, P.G. (eds) Euro-Par 2012 Parallel Processing. Euro-Par 2012. Lecture Notes in Computer Science, vol 7484. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32820-6_53
DOI: https://doi.org/10.1007/978-3-642-32820-6_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32819-0
Online ISBN: 978-3-642-32820-6
eBook Packages: Computer Science, Computer Science (R0)