Abstract
Traditional congestion control protocols may fail to achieve consistently high performance over a wide range of networking environments because their hardwired policies are optimized for specific network conditions. In this paper, we depart from conventional wisdom and propose Glider, a new congestion control protocol that uses deep reinforcement learning to be more versatile and adaptive to dynamic environments. In particular, Glider uses a framework based on Deep Q-Network, in which a sender keeps adapting its congestion control strategy by continuously interacting with the network environment. However, the sender sends data constantly, which makes it challenging to apply reinforcement learning algorithms that require step-by-step state computation to congestion control. We therefore design a Dynamic Bisection Division Algorithm (DBDA) that discretizes the packet transmission process into steps, making Glider feasible for congestion control. Extensive experiments on Pantheon show that Glider adapts well to varying buffer sizes and is resilient to random loss. Moreover, on wide-area inter-data center links, it achieves 6.4\(\times\) and 1.4\(\times\) higher throughput than TCP CUBIC and BBR, respectively, and performance comparable to that of other learning-based congestion control protocols in the literature.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (No. U20A20177, 61772377, 91746206), the Fundamental Research Funds for the Central Universities (2042020kf0217), and the Science and Technology planning project of Shenzhen (JCYJ202103243002197).
Additional information
This article belongs to the Topical Collection: Special Issue on Decision Making in Heterogeneous Network Data Scenarios and Applications
Guest Editors: Jianxin Li, Chengfei Liu, Ziyu Guan, and Yinghui Wu
Cite this article
Xia, Z., Wu, L., Wang, F. et al. Glider: rethinking congestion control with deep reinforcement learning. World Wide Web 26, 115–137 (2023). https://doi.org/10.1007/s11280-022-01018-1
DOI: https://doi.org/10.1007/s11280-022-01018-1