Abstract
New deadlock-free, unicast-based all-to-all broadcast algorithms are proposed for dragonfly networks. An all-to-all broadcast delivers a message from each router to all routers. Two all-to-all broadcast algorithms, GFA2A and RFA2A, are presented; they build on the previous group-first and router-first one-to-all broadcast schemes, respectively. A new all-to-all broadcast algorithm named A2A is also presented: all messages from the routers in the same group are first collected at a single router and combined, and the combined message is then forwarded to all routers in the same group. After receiving all messages from other groups, each router forwards them to all other routers in its own group. The proposed algorithms can be implemented with unicast hardware, that is, each input port is assigned two indistinguishable buffers.
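The collect-and-combine idea behind A2A can be illustrated with a small logical simulation. This is only a sketch under stated assumptions, not the algorithm from the paper: the function name `a2a_sketch`, the choice of router 0 as each group's collector, and the direct collector-to-collector exchange between groups are all hypothetical, since the abstract does not specify them. The model also ignores link topology, routing paths, buffers, and deadlock avoidance; it only tracks which messages each router has seen and how many unicasts were issued.

```python
from itertools import product


def a2a_sketch(num_groups, routers_per_group):
    """Simulate a gather / exchange / forward all-to-all broadcast.

    Routers are identified as (group, index). Router 0 of each group
    acts as the collector (an assumption made for this sketch).
    Returns the messages known to each router and the unicast count.
    """
    routers = list(product(range(num_groups), range(routers_per_group)))
    # Each router starts out knowing only its own message.
    known = {r: {r} for r in routers}
    unicasts = 0

    # Phase 1: every non-collector sends its message to its group's collector.
    for g, i in routers:
        if i != 0:
            known[(g, 0)] |= known[(g, i)]
            unicasts += 1

    # Phase 2 (assumed): collectors exchange their combined group messages.
    # Snapshot first so each collector sends only its own group's messages.
    combined = {g: set(known[(g, 0)]) for g in range(num_groups)}
    for src in range(num_groups):
        for dst in range(num_groups):
            if src != dst:
                known[(dst, 0)] |= combined[src]
                unicasts += 1

    # Phase 3: each collector forwards everything to the routers in its group.
    for g in range(num_groups):
        for i in range(1, routers_per_group):
            known[(g, i)] |= known[(g, 0)]
            unicasts += 1

    return known, unicasts


if __name__ == "__main__":
    known, unicasts = a2a_sketch(num_groups=3, routers_per_group=4)
    complete = all(len(msgs) == 3 * 4 for msgs in known.values())
    print("every router holds every message:", complete)
    print("unicasts issued:", unicasts)
```

In this simplified model, g groups of a routers need 2g(a-1) + g(g-1) unicasts, because combining messages at the collector lets one inter-group unicast carry a whole group's messages. The real cost on a dragonfly depends on the global-link layout and the routing scheme, which the sketch does not model.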
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Xiang, D., Ju, Y. (2021). All-to-All Broadcast in Dragonfly Networks. In: Chen, CY., Hon, WK., Hung, LJ., Lee, CW. (eds) Computing and Combinatorics. COCOON 2021. Lecture Notes in Computer Science, vol 13025. Springer, Cham. https://doi.org/10.1007/978-3-030-89543-3_2
DOI: https://doi.org/10.1007/978-3-030-89543-3_2
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89542-6
Online ISBN: 978-3-030-89543-3
eBook Packages: Computer Science, Computer Science (R0)