Glider: rethinking congestion control with deep reinforcement learning

Xia, Zhenchang; Wu, Libing; Wang, Fei; Liao, Xudong; Hu, Haiyan; Wu, Jia; Wu, Dan

doi:10.1007/s11280-022-01018-1

Published: 27 June 2022

World Wide Web, Volume 26, pages 115–137 (2023)

Zhenchang Xia¹,² (ORCID: orcid.org/0000-0001-8295-8359),
Libing Wu¹,
Fei Wang³,
Xudong Liao³,
Haiyan Hu³,
Jia Wu⁴ &
Dan Wu⁵

Abstract

Traditional congestion control protocols may fail to achieve consistently high performance over a wide range of networking environments because their hardwired policies are optimized for specific network conditions. In this paper, we depart from conventional wisdom and propose Glider, a new congestion control protocol that uses deep reinforcement learning to be more versatile and adaptive to dynamic environments. In particular, Glider uses a framework based on Deep Q-Network in which a sender keeps adapting its congestion control strategy by continuously interacting with the network environment. Because the sender transmits data continuously, it is challenging to apply reinforcement learning algorithms, which require step-by-step state computation, to congestion control. We therefore design a Dynamic Bisection Division Algorithm (DBDA) that discretizes the packet transmission process into steps, making Glider feasible for congestion control. Extensive experiments on Pantheon show that Glider adapts well to varying buffer sizes and is resilient to random loss. Moreover, on wide-area inter-data center links, it can achieve 6.4× and 1.4× higher throughput than TCP CUBIC and BBR, respectively, and performance comparable to that of other learning-based congestion control protocols in the literature.
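
To make the Deep Q-Network idea concrete, the following is a minimal sketch of the two generic ingredients of Q-learning at a sender: epsilon-greedy action selection and the one-step temporal-difference target. It is not Glider's implementation. The action set (multiplicative congestion-window factors), the function names, and all parameter values are assumptions chosen for illustration, and DBDA itself is not reproduced because the abstract does not specify it.

```python
import random

# Hypothetical action set (an assumption, not taken from the paper):
# multiplicative changes applied to the congestion window.
ACTIONS = [0.5, 1.0, 1.5]


def td_target(reward, next_q_values, gamma=0.99, done=False):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action index with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def apply_action(cwnd, action_index, min_cwnd=1.0):
    """Scale the congestion window by the chosen factor, never below min_cwnd."""
    return max(min_cwnd, cwnd * ACTIONS[action_index])


if __name__ == "__main__":
    q_values = [0.1, 0.9, 0.3]          # illustrative Q-values for one state
    action = epsilon_greedy(q_values, epsilon=0.0)
    print("chosen action index:", action)
    print("new cwnd:", apply_action(10.0, action))
    print("TD target:", td_target(1.0, [0.0, 2.0, 1.0], gamma=0.5))
```

In a full DQN, a neural network would produce the Q-values and be trained toward this target over replayed transitions; the sketch isolates only the decision and target computation for a single step.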

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. U20A20177, 61772377, 91746206), the Fundamental Research Funds for the Central Universities (2042020kf0217), and the Science and Technology Planning Project of Shenzhen (JCYJ202103243002197).

Author information

Authors and Affiliations

School of Computer Science, Wuhan University, Wuhan, China
Zhenchang Xia & Libing Wu
The State Key Laboratory of Integrated Services Networks, Xidian University, Xi’an, Shaanxi, China
Zhenchang Xia
School of Computer Science, Wuhan University, Wuhan, China
Fei Wang, Xudong Liao & Haiyan Hu
Department of Computing, Macquarie University, Sydney, Australia
Jia Wu
School of Computer Science, University of Windsor, Windsor, Canada
Dan Wu

Corresponding authors

Correspondence to Libing Wu or Dan Wu.

Additional information

This article belongs to the Topical Collection: Special Issue on Decision Making in Heterogeneous Network Data Scenarios and Applications

Guest Editors: Jianxin Li, Chengfei Liu, Ziyu Guan, and Yinghui Wu

About this article

Cite this article

Xia, Z., Wu, L., Wang, F. et al. Glider: rethinking congestion control with deep reinforcement learning. World Wide Web 26, 115–137 (2023). https://doi.org/10.1007/s11280-022-01018-1

Received: 20 September 2021
Revised: 14 January 2022
Accepted: 24 January 2022
Published: 27 June 2022
Issue Date: January 2023
DOI: https://doi.org/10.1007/s11280-022-01018-1
