Abstract
The Dragonfly topology has been proposed and deployed as the interconnection network topology for next-generation supercomputers. Practical routing algorithms developed for Dragonfly are based on a routing scheme called Universal Globally Adaptive Load-balanced routing with Global information (UGAL-G). While UGAL-G and UGAL-based practical routing schemes have been extensively studied, all existing results are based on simulation or measurement. There is no theoretical understanding of how UGAL-based routing schemes achieve their performance on a particular network configuration, or of what these schemes optimize for. In this work, we develop and validate throughput models for UGAL-G on the Dragonfly topology and identify a robust model that is both accurate and efficient across many Dragonfly variations. Given a traffic pattern, the proposed models accurately and effectively estimate the aggregate throughput for that pattern. Our results not only provide a mechanism to predict the communication performance of large-scale Dragonfly networks but also reveal the inner workings of UGAL-G, which furthers our understanding of UGAL-based routing on Dragonfly.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Mollah, M.A., Faizian, P., Rahman, M.S., Yuan, X., Pakin, S., Lang, M. (2018). Modeling UGAL on the Dragonfly Topology. In: Jarvis, S., Wright, S., Hammond, S. (eds) High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation. PMBS 2017. Lecture Notes in Computer Science, vol 10724. Springer, Cham. https://doi.org/10.1007/978-3-319-72971-8_7
DOI: https://doi.org/10.1007/978-3-319-72971-8_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-72970-1
Online ISBN: 978-3-319-72971-8
eBook Packages: Computer Science, Computer Science (R0)