DOI: 10.1145/3555050.3569125
research-article

Spine: an efficient DRL-based congestion control with ultra-low overhead

Published: 30 November 2022

Abstract

Existing congestion control (CC) algorithms based on deep reinforcement learning (DRL) directly adjust the flow sending rate in response to dynamic bandwidth changes, resulting in high inference overhead. Such overhead may consume considerable CPU resources and hurt datapath performance. In this paper, we present Spine, a hierarchical congestion control algorithm that fully utilizes the performance gain from deep reinforcement learning but with ultra-low overhead. At its heart, Spine decouples the congestion control task into two subtasks at different timescales and handles them with different components: i) a lightweight CC executor that performs fine-grained control in response to dynamic bandwidth changes, and ii) an RL agent that works at a coarse-grained level and generates control sub-policies for the CC executor. This two-level control architecture provides fine-grained DRL-based control with low model inference overhead. Real-world experiments and emulations show that Spine achieves consistently high performance across various network conditions while reducing control overhead by at least 80% compared to its DRL-based counterparts, to a level similar to classic CC schemes such as Cubic.
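
The two-timescale split described in the abstract can be illustrated with a minimal Python sketch. This is not Spine's implementation. The abstract does not say what form a sub-policy takes, so the `SubPolicy` parameters (an AIMD-style additive increase and multiplicative decrease), the `toy_agent` rule that stands in for the RL model, and the `agent_interval` setting are all hypothetical. The sketch only shows the structure: the executor reacts to every feedback event with cheap arithmetic, while the costly policy generator runs once per many events.

```python
from dataclasses import dataclass


@dataclass
class SubPolicy:
    """Hypothetical sub-policy: parameters of a simple AIMD rule."""
    increase: float  # packets added to cwnd on a non-congested event
    decrease: float  # multiplicative factor applied to cwnd on congestion


class Executor:
    """Lightweight fine-grained controller; runs on every feedback event."""

    def __init__(self, policy: SubPolicy, cwnd: float = 10.0):
        self.policy = policy
        self.cwnd = cwnd

    def on_feedback(self, congested: bool) -> float:
        if congested:
            # Never shrink below one packet.
            self.cwnd = max(1.0, self.cwnd * self.policy.decrease)
        else:
            self.cwnd += self.policy.increase
        return self.cwnd


def toy_agent(loss_rate: float) -> SubPolicy:
    """Stand-in for the RL agent (NOT a learned model).

    A real agent would run model inference here; this rule only picks
    between two hand-written parameter sets.
    """
    if loss_rate > 0.1:
        return SubPolicy(increase=0.5, decrease=0.5)  # conservative
    return SubPolicy(increase=2.0, decrease=0.8)      # aggressive


def run(events: list[bool], agent_interval: int) -> tuple[float, int]:
    """Feed congestion signals to the executor and consult the agent only
    once every `agent_interval` events.

    Returns (final cwnd, number of agent invocations).
    """
    executor = Executor(toy_agent(0.0))
    agent_calls = 1
    window: list[bool] = []
    for congested in events:
        executor.on_feedback(congested)           # fine-grained, every event
        window.append(congested)
        if len(window) == agent_interval:         # coarse-grained update
            executor.policy = toy_agent(sum(window) / len(window))
            agent_calls += 1
            window.clear()
    return executor.cwnd, agent_calls
```

With 100 events and `agent_interval=20`, `run` invokes the agent 6 times (once at start plus five updates) while the executor handles all 100 events. That ratio is what keeps inference cost small relative to per-event control in a design of this shape.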

      Information & Contributors

      Information

      Published In

      CoNEXT '22: Proceedings of the 18th International Conference on emerging Networking EXperiments and Technologies
      November 2022
      431 pages
      ISBN:9781450395083
      DOI:10.1145/3555050
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. congestion control
      2. deep reinforcement learning
      3. transport layer protocols

      Qualifiers

      • Research-article

      Funding Sources

      • the Key-Area Research and Development Program of Guangdong Province
      • the Hong Kong RGC TRS
      • the NSFC Grant
      • GRF research funding

      Conference

      CoNEXT '22

      Acceptance Rates

      CoNEXT '22 Paper Acceptance Rate: 28 of 151 submissions, 19%
      Overall Acceptance Rate: 198 of 789 submissions, 25%

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months): 113
      • Downloads (Last 6 weeks): 10
      Reflects downloads up to 05 Mar 2025

      Citations

      Cited By

      • (2024) Lightweight Automatic ECN Tuning Based on Deep Reinforcement Learning With Ultra-Low Overhead in Datacenter Networks. IEEE Transactions on Network and Service Management 21:6 (6398-6408). DOI: 10.1109/TNSM.2024.3450596. Online publication date: Dec-2024
      • (2024) Dragonfly: In-Flight CCA Identification. IEEE Transactions on Network and Service Management 21:3 (2675-2685). DOI: 10.1109/TNSM.2024.3380417. Online publication date: Jun-2024
      • (2024) Resource Critical Flow Monitoring in Software-Defined Networks. IEEE/ACM Transactions on Networking 32:1 (396-410). DOI: 10.1109/TNET.2023.3286691. Online publication date: Feb-2024
      • (2024) Reinforcement Learning-based Congestion Control: A Systematic Evaluation of Fairness, Efficiency and Responsiveness. IEEE INFOCOM 2024 - IEEE Conference on Computer Communications (1451-1460). DOI: 10.1109/INFOCOM52122.2024.10621288. Online publication date: 20-May-2024
      • (2024) Aquilas: Adaptive QoS-Oriented Multipath Packet Scheduler with Hierarchical Intelligence for QUIC. 2024 IEEE 44th International Conference on Distributed Computing Systems (ICDCS) (485-495). DOI: 10.1109/ICDCS60910.2024.00052. Online publication date: 23-Jul-2024
      • (2024) RNN-based Congestion Control in the Linux Kernel. 2024 Twelfth International Symposium on Computing and Networking Workshops (CANDARW) (130-136). DOI: 10.1109/CANDARW64572.2024.00028. Online publication date: 26-Nov-2024
      • (2023) Dragonfly: In-Flight CCA Identification. 2023 IFIP Networking Conference (IFIP Networking) (1-9). DOI: 10.23919/IFIPNetworking57963.2023.10186432. Online publication date: 12-Jun-2023
      • (2023) LiteFlow: Toward High-Performance Adaptive Neural Networks for Kernel Datapath. IEEE/ACM Transactions on Networking 32:1 (627-642). DOI: 10.1109/TNET.2023.3293152. Online publication date: 17-Jul-2023
      • (2023) A Data-Driven Framework for TCP to Achieve Flexible QoS Control in Mobile Data Networks. 2023 IEEE/ACM 31st International Symposium on Quality of Service (IWQoS) (1-11). DOI: 10.1109/IWQoS57198.2023.10188765. Online publication date: 19-Jun-2023
      • (2023) Towards Enabling Performance-Guaranteed Slice Management and Orchestration in 6G. 2023 Joint European Conference on Networks and Communications & 6G Summit (EuCNC/6G Summit) (729-734). DOI: 10.1109/EuCNC/6GSummit58263.2023.10188226. Online publication date: 6-Jun-2023
