Abstract
Edge artificial intelligence can empower otherwise simple industrial wireless networks (IWNs) to support complex and dynamic tasks by collaboratively exploiting the computation and communication resources of both machine-type devices (MTDs) and edge servers. In this paper, we propose a multi-agent deep reinforcement learning based resource allocation (MADRL-RA) algorithm for end-edge orchestrated IWNs to support computation-intensive and delay-sensitive applications. First, we present the system model of IWNs, in which each MTD acts as a self-learning agent. Then, we formulate the resource allocation problem as a Markov decision process whose objective is to minimize the system overhead, defined as a joint function of delay and energy consumption. Next, we employ MADRL to overcome the explosive state space and learn an effective resource allocation policy over the computing decision, computation capacity, and transmission power. To break the time correlation of training data while accelerating the learning process of MADRL-RA, we design a weighted experience replay that stores and samples experiences by category. Furthermore, we propose a step-by-step ε-greedy method to balance exploitation and exploration. Finally, we verify the effectiveness of MADRL-RA by comparing it with several benchmark algorithms in extensive experiments, showing that MADRL-RA converges quickly and learns an effective resource allocation policy that achieves the minimum system overhead.
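The two training devices mentioned above, categorical (weighted) experience replay and step-by-step ε-greedy exploration, can be illustrated with a minimal sketch. The class and parameter names below are illustrative assumptions, not the paper's implementation: experiences are split into buckets (e.g., by reward quality) and sampled with fixed bucket weights, while ε decreases by a fixed step at fixed episode intervals rather than continuously.

```python
import random
from collections import deque


class WeightedReplayBuffer:
    """Categorical experience replay (illustrative): experiences are stored
    in separate buckets and each mini-batch draws a fixed fraction from
    every bucket, breaking the time correlation of consecutive samples."""

    def __init__(self, capacity=1000, weights=(0.7, 0.3)):
        # One bounded FIFO buffer per category; old experiences are evicted.
        self.buckets = [deque(maxlen=capacity) for _ in weights]
        self.weights = weights

    def store(self, experience, bucket):
        self.buckets[bucket].append(experience)

    def sample(self, batch_size):
        # Draw round(batch_size * weight) experiences from each bucket,
        # capped by the bucket's current size, then shuffle the batch.
        batch = []
        for bucket, w in zip(self.buckets, self.weights):
            k = min(len(bucket), int(round(batch_size * w)))
            batch.extend(random.sample(list(bucket), k))
        random.shuffle(batch)
        return batch


def stepwise_epsilon(episode, eps_start=1.0, eps_min=0.05,
                     step=0.05, interval=100):
    """Step-by-step ε-greedy (illustrative): ε drops by a fixed `step`
    every `interval` episodes, shifting from exploration to exploitation
    in discrete stages instead of by continuous decay."""
    return max(eps_min, eps_start - step * (episode // interval))
```

With these defaults, ε stays at 1.0 for the first 100 episodes, then steps down to 0.95, 0.90, and so on, until it reaches the floor of 0.05; a 70/30 sampling split keeps the rarer bucket represented in every batch.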
Author information
Contributions
Xiaoyu LIU, Chi XU, and Haibin YU designed the research. Xiaoyu LIU processed the data and drafted the paper. Chi XU, Haibin YU, and Peng ZENG helped organize the paper. Xiaoyu LIU and Chi XU revised and finalized the paper.
Additional information
Compliance with ethics guidelines
Xiaoyu LIU, Chi XU, Haibin YU, and Peng ZENG declare that they have no conflict of interest.
Project supported by the National Key R&D Program of China (No. 2020YFB1710900), the National Natural Science Foundation of China (Nos. 62173322, 61803368, and U1908212), the China Postdoctoral Science Foundation (No. 2019M661156), and the Youth Innovation Promotion Association, Chinese Academy of Sciences (No. 2019202)
About this article
Cite this article
Liu, X., Xu, C., Yu, H. et al. Multi-agent deep reinforcement learning for end-edge orchestrated resource allocation in industrial wireless networks. Front Inform Technol Electron Eng 23, 47–60 (2022). https://doi.org/10.1631/FITEE.2100331
Key words
- Multi-agent deep reinforcement learning
- End-edge orchestrated
- Industrial wireless networks
- Delay
- Energy consumption