ABSTRACT
Recent years have witnessed a plethora of learning-based solutions for congestion control (CC) that outperform traditional TCP schemes. However, they fail to provide consistently good convergence properties, including fairness, fast convergence and stability, because their objective functions do not capture these properties. Although the idea is intuitive, integrating these properties into existing learning-based CC is challenging because: 1) their training environments are designed to optimize the performance of a single flow and cannot support cooperative multi-flow optimization, and 2) there is no directly measurable metric that expresses these properties in the training objective function.
We present Astraea, a new learning-based congestion control scheme that ensures fast convergence to fairness with stability. At the heart of Astraea is a multi-agent deep reinforcement learning framework that explicitly optimizes these convergence properties during training by enabling multiple competing flows to learn interactive policies, while maintaining high performance. We further build a faithful multi-flow environment that emulates the competing behaviors of concurrent flows and explicitly expresses convergence properties so that they can be optimized during training. We have fully implemented Astraea, and our comprehensive experiments show that Astraea converges quickly to the fairness point and exhibits better stability than its counterparts. For example, Astraea achieves near-optimal bandwidth sharing (i.e., fairness) when multiple flows compete for the same bottleneck, and delivers up to 8.4× faster convergence and 2.8× smaller throughput deviation, while achieving comparable or even better performance than prior solutions.
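As a rough illustration of how the convergence properties named above might be quantified, the sketch below computes Jain's fairness index over per-flow throughputs and the standard deviation of one flow's throughput samples. The abstract does not specify which metrics Astraea uses; the choice of Jain's index and population standard deviation, as well as the function names `jain_fairness_index` and `throughput_deviation`, are assumptions made for illustration only.

```python
from statistics import pstdev


def jain_fairness_index(throughputs):
    """Jain's index: (sum x)^2 / (n * sum x^2).

    Returns 1.0 when all flows get equal shares and 1/n when a single
    flow takes the whole bottleneck. Assumed metric, not taken from the abstract.
    """
    n = len(throughputs)
    sum_sq = sum(x * x for x in throughputs)
    if n == 0 or sum_sq == 0:
        raise ValueError("need at least one flow with non-zero throughput")
    total = sum(throughputs)
    return (total * total) / (n * sum_sq)


def throughput_deviation(samples):
    """Population standard deviation of one flow's throughput samples.

    A smaller value indicates a more stable sending rate (illustrative proxy).
    """
    return pstdev(samples)


if __name__ == "__main__":
    # Hypothetical per-flow throughputs in Mbps on a shared bottleneck.
    print(jain_fairness_index([10.0, 10.0, 10.0]))  # equal shares
    print(jain_fairness_index([30.0, 0.0, 0.0]))    # one flow dominates
    print(throughput_deviation([4.0, 6.0]))
```

Under these assumptions, an index close to 1 would correspond to the near-optimal bandwidth sharing described above, and a lower deviation to better stability.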
Index Terms
- Astraea: Towards Fair and Efficient Learning-based Congestion Control