Abstract
Collective communication is widely used in parallel applications. Collective-communication operations, such as Broadcast, Allreduce, and Alltoall, are typically composed of a large number of peer-to-peer (P2P) communications, so the latency of P2P communication affects the overall performance of collective communication. This paper proposes using circulant network topologies for a high-radix interconnection network to improve the performance of collective communications. The circulant network topology takes advantage of an algorithmic feature that reduces the total hop count of collective communications. SimGrid discrete-event simulation results show that the execution time of collective communication on a circulant network topology improved by 25.7% and 43.1% compared with random and dragonfly network topologies of the same degree, respectively. It also improved by 40.6% and 19.5% on average compared with 3-D torus and hypercube topologies, respectively.
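To make the hop-count argument concrete, the following Python sketch builds a circulant graph, in which node i is linked to nodes (i ± s) mod n for every jump s, and sums the hop counts of a simple communication pattern. The jump set, the partner offsets, the network size, and all function names are assumptions chosen for illustration; they are not the configuration or the algorithm evaluated in the paper.

```python
from collections import deque

def circulant_neighbors(n, jumps):
    """Adjacency lists of the circulant graph C_n(jumps): node i links to (i +/- s) mod n."""
    return [
        sorted({(i + s) % n for s in jumps} | {(i - s) % n for s in jumps})
        for i in range(n)
    ]

def hops_from(adj, src):
    """Breadth-first search: hop count from src to every node."""
    dist = [-1] * len(adj)
    dist[src] = 0
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] < 0:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def total_hops(adj, offsets):
    """Total hops when every node i sends one message to (i + d) mod n for each d in offsets."""
    n = len(adj)
    total = 0
    for i in range(n):
        dist = hops_from(adj, i)
        total += sum(dist[(i + d) % n] for d in offsets)
    return total

n = 16                     # illustrative network size (assumption)
offsets = [1, 2, 4, 8]     # partner offsets of a doubling-style collective (assumption)
ring = circulant_neighbors(n, [1])            # plain ring, i.e. C_16(1)
circ = circulant_neighbors(n, [1, 2, 4, 8])   # jumps matched to the offsets (assumption)

print(total_hops(ring, offsets), total_hops(circ, offsets))  # prints "240 64"
```

In this toy setting every partner is a direct neighbor on the circulant graph, so each message takes one hop, whereas on the ring a message to offset d travels d hops. The price is a higher node degree (7 here instead of 2), which is the kind of trade-off a high-radix network is meant to absorb.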
Acknowledgment
This work was partly supported by JSPS KAKENHI Grant Number 19H01106.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Cui, K., Koibuchi, M. (2023). A High-Radix Circulant Network Topology for Efficient Collective Communication. In: Takizawa, H., Shen, H., Hanawa, T., Hyuk Park, J., Tian, H., Egawa, R. (eds) Parallel and Distributed Computing, Applications and Technologies. PDCAT 2022. Lecture Notes in Computer Science, vol 13798. Springer, Cham. https://doi.org/10.1007/978-3-031-29927-8_31
DOI: https://doi.org/10.1007/978-3-031-29927-8_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-29926-1
Online ISBN: 978-3-031-29927-8
eBook Packages: Computer Science (R0)