DOI: 10.1145/3472456.3472466
research-article

Receiver-Driven Congestion Control for InfiniBand

Published: 05 October 2021

ABSTRACT

InfiniBand (IB) has become one of the most popular high-speed interconnects in High Performance Computing (HPC). The backpressure effect of credit-based link-layer flow control in IB causes congestion spreading, which increases queueing delay and hurts application completion time. IB congestion control (IB CC) is defined in the IB specification to address the congestion spreading problem. HPC clusters are now increasingly used to run diverse workloads over a shared network infrastructure, and the coexistence of message transfers from different applications poses great challenges to IB CC. In this paper, we re-examine IB CC through fine-grained experimental observations and reveal several fundamental problems. Based on these insights, we present a new receiver-driven congestion control for InfiniBand (RR CC). RR CC includes two key mechanisms: receiver-driven congestion identification and receiver-driven rate regulation, which together eliminate both in-network congestion and endpoint congestion in one control loop. RR CC has far fewer parameters than IB CC and requires no modifications to InfiniBand switches. Evaluations show that RR CC achieves better average and tail message latency and better link utilization than IB CC under various scenarios.
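
As a rough illustration of the backpressure effect mentioned above, the following toy model chains two credit-controlled links (source to switch, switch to endpoint). It is only a sketch under simplifying assumptions: one packet per step, fixed buffer sizes, and no congestion control at all. It is not the paper's simulator and does not model RR CC or IB CC; the names `CreditLink` and `simulate` are hypothetical.

```python
from collections import deque


class CreditLink:
    """One hop of a lossless link with credit-based flow control (toy model)."""

    def __init__(self, buffer_slots):
        self.credits = buffer_slots   # sender-side count of free receiver slots
        self.rx_buffer = deque()      # packets waiting at the receiving side

    def try_send(self, packet):
        """Transmit only if a credit is available; otherwise the sender stalls."""
        if self.credits == 0:
            return False              # backpressure: no buffer space downstream
        self.credits -= 1
        self.rx_buffer.append(packet)
        return True

    def drain_one(self):
        """Remove one buffered packet and hand a credit back to the sender."""
        if not self.rx_buffer:
            return None
        packet = self.rx_buffer.popleft()
        self.credits += 1
        return packet


def simulate(upstream_slots, downstream_slots, offered, drain_downstream):
    """Source -> switch -> endpoint. Returns how many packets the source injected."""
    up = CreditLink(upstream_slots)       # source to switch
    down = CreditLink(downstream_slots)   # switch to endpoint
    injected = 0
    for step in range(offered):
        # The endpoint consumes a packet only when it is not congested.
        if drain_downstream:
            down.drain_one()
        # The switch forwards only if the downstream link still has a credit.
        if up.rx_buffer and down.credits > 0:
            down.try_send(up.drain_one())
        # The source offers one packet per step.
        if up.try_send(step):
            injected += 1
    return injected


if __name__ == "__main__":
    print(simulate(2, 2, 10, drain_downstream=True))
    print(simulate(2, 2, 10, drain_downstream=False))
```

When the endpoint stops draining, the downstream buffer fills first, then the switch buffer, and finally the source itself is blocked even though the link next to it is healthy. That upstream propagation of stalls is the congestion spreading the abstract refers to.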


  • Published in

    ICPP '21: Proceedings of the 50th International Conference on Parallel Processing
    August 2021
    927 pages
    ISBN: 9781450390682
    DOI: 10.1145/3472456

    Copyright © 2021 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 91 of 313 submissions, 29%