DOI: 10.1145/3609021.3609295
research-article

TCP's Third Eye: Leveraging eBPF for Telemetry-Powered Congestion Control

Published: 10 September 2023

ABSTRACT

For years, congestion control algorithms have been navigating in the dark, blind to the actual state of the network. They were limited to the coarse-grained signals visible from the OS kernel: locally measured values (e.g., RTT) and hints of imminent congestion (e.g., packet loss and ECN). As applications and OSs become ever more distributed, it is only natural that the kernel have visibility beyond the host, into the network fabric. Network switches already collect telemetry, but it has been impractical to export it so that the end-host can react.

Although some telemetry-based solutions have been proposed, they require changes to the end-host, such as custom hardware or new protocols and network stacks. We address the challenges of efficiency and protocol compatibility, showing that it is possible and practical to run telemetry-based congestion control algorithms (CCAs) in the kernel. We designed a framework that uses eBPF to run CCAs that can execute different control laws by selecting different types of telemetry. It can be deployed in brownfield environments, without requiring that all switches be telemetry-enabled or that end-host kernels be recompiled. When our eBPF program is deployed on hosts without hardware or OS changes, TCP incast workloads experience less queuing (and thus lower latency), faster convergence, and better fairness.
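
The abstract does not spell out the control laws or the telemetry fields the framework uses, so the following is only an illustrative sketch of the general idea: a congestion window update that picks one of several control laws depending on which type of telemetry is selected. It is written in Python for readability rather than as an eBPF program, and every name and constant in it (`HopTelemetry`, `queue_law`, `utilization_law`, `update_cwnd`, the target values) is a hypothetical assumption, not the authors' design.

```python
from dataclasses import dataclass


@dataclass
class HopTelemetry:
    """Per-hop telemetry as a switch might report it (hypothetical fields)."""
    queue_bytes: int          # queue occupancy at this hop
    link_utilization: float   # fraction of link capacity in use


def queue_law(cwnd, hop, target_queue_bytes=30_000, mss=1460):
    """Grow by one MSS below the queue target, shrink proportionally above it."""
    if hop.queue_bytes <= target_queue_bytes:
        return cwnd + mss
    return max(mss, int(cwnd * target_queue_bytes / hop.queue_bytes))


def utilization_law(cwnd, hop, target_utilization=0.95, mss=1460):
    """Scale cwnd toward a target utilization, then add a one-MSS probe."""
    u = max(hop.link_utilization, 1e-6)       # avoid division by zero
    scale = min(2.0, target_utilization / u)  # cap growth at 2x per update
    return max(mss, int(cwnd * scale) + mss)


# Each law is paired with the telemetry field that identifies the worst hop.
LAWS = {
    "queue": (queue_law, lambda h: h.queue_bytes),
    "utilization": (utilization_law, lambda h: h.link_utilization),
}


def update_cwnd(cwnd, hops, law_name):
    """Apply the selected control law to the most congested reporting hop.

    An empty `hops` list models a path on which no switch reported
    telemetry; the window is returned unchanged in that case.
    """
    if not hops:
        return cwnd
    law, severity = LAWS[law_name]
    worst = max(hops, key=severity)
    return law(cwnd, worst)


if __name__ == "__main__":
    path = [HopTelemetry(10_000, 0.5), HopTelemetry(60_000, 1.2)]
    for name in LAWS:
        print(name, update_cwnd(14_600, path, name))
```

Keeping the laws in a lookup table mirrors the notion of selecting a control law by telemetry type, and returning the window unchanged when no hop reports telemetry is one simple way to model a path through switches that are not telemetry-enabled.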

Published in

eBPF '23: Proceedings of the 1st Workshop on eBPF and Kernel Extensions
September 2023
96 pages
ISBN: 9798400702938
DOI: 10.1145/3609021

Copyright © 2023 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance rate: eBPF '23 accepted 12 of 21 submissions (57%).