ABSTRACT
Deployed congestion control algorithms rely on end hosts to figure out how congested the network is. Initially, for TCP, the congestion signal was packet drops, a binary signal that kicks in only after congestion is well underway. More recently, the trend has been towards using RTT as the congestion signal instead (e.g., TIMELY and BBR). But RTT is a noisy surrogate for congestion: it contains a valuable signal about congestion at the bottleneck, but it also includes noise from the queuing delay at non-bottlenecked switches. Taking a step back, it is worth asking: why don't the switches and routers simply tell us the actual congestion they are experiencing? After all, they must keep track of the precise occupancy of their own queues anyway, so they can report it directly to the end hosts. Conventional wisdom held that this was too expensive (in terms of additional bits in headers, or complexity and power consumption). We argue that even if this was once the case, it no longer is. Today, it is quite feasible, with negligible increase in power or lost capacity, to report the precise queuing delay at the switches, allowing the end hosts to make more accurate decisions while minimizing the required buffering. We explore how this might work using modern programmable switches and NICs that stamp each packet with the queue occupancy (or the maximum queue occupancy along the path), which can be thought of as a multi-bit ECN signal. We provide evidence that the resulting signal is a more accurate indication of congestion at the flow's bottleneck and can lead to higher link utilization and shorter flow completion times than RTT-based algorithms. Consequently, it becomes easier to control the required buffer sizes. Our goal here is not to argue for a particular multi-bit ECN algorithm, but to point out that in the future, there will no longer be a need to rely on noisy RTT measurements at the edges.
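The stamping idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Packet` class, its `max_queue_bytes` field, and the helper functions are hypothetical names chosen for illustration, and the queue sizes are made-up inputs. The sketch only shows that a per-hop "keep the maximum" update carries the bottleneck's occupancy to the receiver, whereas a delay-style aggregate also accumulates the queues at non-bottlenecked hops.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    # Hypothetical header field: largest queue occupancy (bytes) seen so far.
    max_queue_bytes: int = 0


def stamp(packet: Packet, queue_bytes: int) -> Packet:
    """Switch-side step: keep the larger of the carried value and this hop's queue."""
    packet.max_queue_bytes = max(packet.max_queue_bytes, queue_bytes)
    return packet


def traverse(path_queue_bytes: list[int]) -> Packet:
    """Send one packet through a path, given each hop's current queue occupancy."""
    packet = Packet()
    for queue_bytes in path_queue_bytes:
        stamp(packet, queue_bytes)
    return packet


def summed_queue_bytes(path_queue_bytes: list[int]) -> int:
    """Delay-style aggregate: every hop's queue contributes, bottleneck or not."""
    return sum(path_queue_bytes)


if __name__ == "__main__":
    # Assumed example path: the middle hop is the bottleneck.
    path = [2000, 150000, 500]
    print("stamped max:", traverse(path).max_queue_bytes)
    print("summed over hops:", summed_queue_bytes(path))
```

In this sketch the stamped value depends only on the most congested hop, so changes in the small queues at the other hops do not perturb it. How an end host should react to the stamped value is deliberately left out.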