Abstract
Existing research on applying deep reinforcement learning to route optimization in software-defined networking (SDN) does not consider the network's spatial–temporal correlation from a global view, which leaves room for further performance gains. To address this, this study proposes a Proximal Policy Optimization algorithm based on the Attention mechanism and Spatio–Temporal correlation (ASTPPO) for SDN routing optimization. First, we extract temporal and spatial correlation features from the state information using Gated Recurrent Units (GRU) and Graph Attention Networks (GAT), which gives the reinforcement learning agent richer implicit information about the environment for its decisions. Second, we use a skip connection to feed both this implicit information and the directly related state information into a multi-layer perceptron, improving the model's learning efficiency and perceptual ability. Finally, we demonstrate the effectiveness of ASTPPO through static and dynamic traffic experiments. Benefiting from spatio–temporal correlation learning with a global view, ASTPPO achieves better load balancing and congestion control than the baseline reinforcement learning algorithms under different traffic intensities and network topologies. The simulation results show that ASTPPO improves on the second-best algorithm by 9.02% and 15.07% in static and dynamic traffic scenarios, respectively.
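The feature pipeline named in the abstract (GRU for temporal features, GAT for spatial features, and a skip connection into the multi-layer perceptron) can be illustrated with a toy sketch. This is not the authors' implementation: the scalar per-node features, the order GRU then GAT, the fixed weights in `GRU_PARAMS`, and all function names are assumptions chosen to keep the example small and dependency-free.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gru_encode(series, p):
    """Run a scalar GRU over one node's time series; return the last hidden state."""
    h = 0.0
    for x in series:
        z = sigmoid(p["wz"] * x + p["uz"] * h)            # update gate
        r = sigmoid(p["wr"] * x + p["ur"] * h)            # reset gate
        cand = math.tanh(p["wh"] * x + p["uh"] * r * h)   # candidate state
        h = (1.0 - z) * h + z * cand                      # blend old and new state
    return h

def attention_weights(i, h, neighbors, a_src, a_dst):
    """Softmax-normalised attention of node i over itself and its neighbours."""
    nodes = [i] + neighbors[i]
    scores = []
    for j in nodes:
        e = a_src * h[i] + a_dst * h[j]
        scores.append(e if e > 0 else 0.2 * e)            # LeakyReLU, slope 0.2
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]            # numerically stable softmax
    total = sum(exps)
    return nodes, [v / total for v in exps]

def gat_layer(h, neighbors, a_src=0.5, a_dst=0.5):
    """Single-head graph attention: attention-weighted sum of neighbour features."""
    out = []
    for i in range(len(h)):
        nodes, alpha = attention_weights(i, h, neighbors, a_src, a_dst)
        out.append(sum(a * h[j] for a, j in zip(alpha, nodes)))
    return out

def build_policy_input(history, neighbors, gru_params):
    """Concatenate the raw latest state with the learned spatio-temporal features."""
    temporal = [gru_encode(series, gru_params) for series in history]
    spatial = gat_layer(temporal, neighbors)
    raw = [series[-1] for series in history]
    return raw + spatial                                  # skip connection by concatenation

# Hypothetical toy data: three nodes on a line topology, three time steps each.
GRU_PARAMS = {"wz": 0.8, "uz": 0.1, "wr": 0.5, "ur": 0.3, "wh": 1.0, "uh": 0.7}
HISTORY = [[0.2, 0.4, 0.6], [0.9, 0.8, 0.7], [0.1, 0.1, 0.2]]
NEIGHBORS = {0: [1], 1: [0, 2], 2: [1]}
POLICY_INPUT = build_policy_input(HISTORY, NEIGHBORS, GRU_PARAMS)
```

In this sketch the first three entries of `POLICY_INPUT` are the unmodified latest observations and the last three are the attention-aggregated GRU states, so a downstream network sees both the direct and the implicit information. A practical model would use learned weight matrices and vector-valued features rather than the fixed scalars assumed here.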
Data availability
The authors confirm that the data supporting the findings of this study are available within the article.
Funding
This work was supported by the National Natural Science Foundation of China (grant number 61861013), the Major Program of the Guangxi Natural Science Foundation (grant number 2020GXNSFDA238001), and the Middle-aged and Young Teachers' Basic Ability Promotion Project of Guangxi (grant number 2020KY05033).
Author information
Contributions
Conceptualization: Junyan Chen and Yong Wang; Methodology: Junyan Chen, Yong Wang and Xuefeng Huang; Software: Xuefeng Huang, Hongmei Zhang and Junyan Chen; Validation: Junyan Chen and Xuefeng Huang; Formal analysis: Junyan Chen and Hongmei Zhang; Investigation: Junyan Chen and Xuefeng Huang; Resources: Xinmei Li; Data curation: Junyan Chen; Writing—original draft preparation: Junyan Chen, Cenhuishan Liao and Xiaolan Xie; Writing—review and editing: Junyan Chen, Hongmei Zhang, Cenhuishan Liao and Xiaolan Xie; Visualization: Wei Xiao and Xinmei Li; Supervision: Yong Wang; Project administration: Yong Wang; Funding acquisition: Yong Wang, Hongmei Zhang and Junyan Chen. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Ethics approval
Not applicable.
Consent to publish
All authors confirm that neither the article nor any parts of its content are currently under consideration or published in another journal. The authors agree to publication in the journal.
Conflict of interest
The authors declare that they have no conflicts of interest.
About this article
Cite this article
Chen, J., Huang, X., Wang, Y. et al. ASTPPO: A proximal policy optimization algorithm based on the attention mechanism and spatio–temporal correlation for routing optimization in software-defined networking. Peer-to-Peer Netw. Appl. 16, 2039–2057 (2023). https://doi.org/10.1007/s12083-023-01489-7
DOI: https://doi.org/10.1007/s12083-023-01489-7