Abstract
Device-to-device (D2D) communication is regarded as a promising way to alleviate the mobile traffic explosion because it improves system data rate and resource utilization. This paper investigates a reconfigurable intelligent surface (RIS) aided mobile D2D communication framework in which the RIS is deployed to improve communication quality. Because the transmission distance of D2D pairs changes over time, mode selection for the D2D pairs and phase shift design for the RIS are essential in mobile scenarios. We therefore formulate a joint optimization problem of mode selection, channel assignment, power allocation, and discrete phase shift selection to maximize the average sum data rate of D2D pairs. The problem is constrained by the maximum transmit power and the minimum data rate requirements of users, where the latter guarantees fairness among D2D pairs. To solve this challenging problem, we first reformulate the original sequential decision-making problem as a Markov game (MG). We then propose a multi-agent deep reinforcement learning (MADRL) framework in which multiple agents cooperatively determine the joint mode selection and resource allocation strategy. The framework combines the multi-pass deep Q-networks (MP-DQN) algorithm with the decaying DQN algorithm: D2D pairs adopt MP-DQN to handle the hybrid discrete-continuous action space, while the RIS agent invokes the decaying DQN to select discrete phase shifts. Simulation results demonstrate that the proposed algorithm converges under different cases and outperforms a combination of DQN and deep deterministic policy gradient (DDPG) in terms of system performance. Moreover, the average sum data rate of D2D pairs is significantly improved by deploying the RIS and is further enhanced by increasing the number of reflecting elements (REs).
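The hybrid discrete-continuous action selection mentioned in the abstract can be illustrated with a small sketch. It follows the multi-pass idea of Bester et al. (2019), where each discrete action is evaluated in a separate pass that sees only its own continuous parameter. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names, the toy stand-ins `q_fn` and `actor_fn` for the neural networks, the reading of the continuous parameter as a transmit power, and the uniformly quantised phase-shift set.

```python
import numpy as np


def multi_pass_q(q_fn, state, params):
    """Return Q(s, k, x_k) for every discrete action k, one pass per action.

    q_fn(state, masked_params) is a hypothetical stand-in for the Q-network
    and returns one Q-value per discrete action.
    """
    num_actions = len(params)
    q_values = np.empty(num_actions)
    for k in range(num_actions):
        masked = np.zeros(num_actions)
        masked[k] = params[k]                 # only x_k is visible in pass k
        q_values[k] = q_fn(state, masked)[k]  # keep only output k of pass k
    return q_values


def select_hybrid_action(q_fn, actor_fn, state, epsilon, rng):
    """Epsilon-greedy choice of a discrete action plus its continuous parameter."""
    params = actor_fn(state)  # one continuous parameter per discrete action
    if rng.random() < epsilon:
        k = int(rng.integers(len(params)))
    else:
        k = int(np.argmax(multi_pass_q(q_fn, state, params)))
    return k, float(params[k])


def discrete_phase_shifts(bits):
    """Assumed phase-shift set: 2**bits uniform levels in [0, 2*pi)."""
    levels = 2 ** bits
    return 2 * np.pi * np.arange(levels) / levels


# Toy stand-ins for the networks (illustrative only).
weights = np.array([3.0, 1.0, 1.0])
toy_q = lambda s, p: weights * p + 0.1 * p.sum()
toy_actor = lambda s: np.array([0.2, 0.8, 0.5])

rng = np.random.default_rng(0)
action, power = select_hybrid_action(toy_q, toy_actor, np.array([1.0, 0.5]), 0.0, rng)
phases = discrete_phase_shifts(2)
```

Masking the other parameters in each pass means the Q-value of action k cannot depend on parameters that belong to other actions, which is the motivation for the multi-pass design. An RIS agent with a purely discrete action space would instead pick one entry of `phases` per reflecting element with an ordinary DQN-style argmax.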
Data availability
Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.
Funding
This work was supported in part by the National Natural Science Foundation of China under Grants No. 61972079, 62172084, 62132004, in part by the Major Research Plan of National Natural Science Foundation of China under Grant No. 92167103, in part by the LiaoNing Revitalization Talents Program under Grant No. XLYC2007162, in part by the LiaoNing Key Research and Development Program under Grant No. 2023JH2/101300196, in part by the Central Government Guided Local Science and Technology Development Fund Project under Grant No. 2020ZY0003, in part by the Science and Technology Plan Project of Inner Mongolia Autonomous Region of China under Grant No. 2020GG0189, in part by the Fundamental Research Funds for the Central Universities under Grants No. N2224001-7, N2216009, N2216006, N2116004.
Ethics declarations
Conflict of interest
We promise that this manuscript is the authors’ original work and has not been published nor has it been submitted simultaneously elsewhere. All authors have checked the manuscript and have agreed to the submission.
About this article
Cite this article
Guo, L., Jia, J., Chen, J. et al. Deep reinforcement learning empowered joint mode selection and resource allocation for RIS-aided D2D communications. Neural Comput & Applic 35, 18231–18249 (2023). https://doi.org/10.1007/s00521-023-08745-0
DOI: https://doi.org/10.1007/s00521-023-08745-0