Abstract
We investigate joint, distributed resource block (RB) and power allocation using game-theoretic learning in an underlay device-to-device (D2D) network, where device pairs communicate directly with each other by reusing the spectrum allocated to the cellular users. We formulate the joint RB and power allocation as multi-agent learning problems with discrete strategy sets, and propose partially distributed and fully distributed learning algorithms to determine the RB and power level to be used by each device pair. The partially distributed algorithms, viz., Fictitious Play and its variant Fading Memory Joint Strategy Fictitious Play with Inertia, achieve a Nash equilibrium (NE) of the sum-rate maximization game in a static wireless environment. The fully distributed and uncoupled Stochastic Learning Algorithm converges to a pure-strategy NE of the interference mitigation game in a time-varying radio environment. We provide proofs for the existence of an NE and for the convergence of the learning algorithms to the NE. The performance of the proposed schemes is evaluated in log-normal, Rayleigh and Nakagami fading environments and compared with an existing hybrid scheme and a centralized scheme. The simulation results show that the partially distributed schemes give the same performance as the centralized scheme, and that the fully distributed scheme gives performance similar to that of the hybrid scheme but with much reduced signaling and computation overhead.
References
Andersin, M., Rosberg, Z., & Zander, J. (1998). Distributed discrete power control in cellular PCS. Wireless Personal Communications, 6(3), 211–231.
Andrews, J. G., Buzzi, S., Choi, W., Hanly, S. V., Lozano, A., Soong, A. C., et al. (2014). What will 5G be? IEEE Journal on Selected Areas in Communications, 32(6), 1065–1082.
Asadi, A., Wang, Q., & Mancuso, V. (2014). A survey on device-to-device communication in cellular networks. IEEE Communications Surveys & Tutorials, 16(4), 1801–1819. https://doi.org/10.1109/COMST.2014.2319555.
Bennis, M., Perlaza, S. M., Blasco, P., Han, Z., & Poor, H. V. (2013). Self-organization in small cell networks: A reinforcement learning approach. IEEE Transactions on Wireless Communications, 12(7), 3202–3212. https://doi.org/10.1109/TWC.2013.060513.120959.
ETSI. (2011). LTE; Evolved Universal Terrestrial Radio Access (E-UTRA); Radio Frequency (RF) requirements for LTE Pico Node B. Tech. rep., ETSI TR 136 931 V9.0.0 (3GPP TR 36.931 version 9.0.0 Release 9). Online at: http://www.etsi.org/deliver/etsi_ts/136100_136199/136104/09.04.00_60/ts_136104v090400p.pdf. Accessed April 2017.
Fudenberg, D., & Tirole, J. (1991). Game theory. Cambridge, MA: MIT Press.
Gu, J., Bae, S. J., Choi, B. G., & Chung, M. Y. (2011). Dynamic power control mechanism for interference coordination of device-to-device communication in cellular networks. In 2011 Third international conference on ubiquitous and future networks (ICUFN) (pp. 71–75). https://doi.org/10.1109/ICUFN.2011.5949138.
Han, K., Choi, S., Choi, S., et al. (2011). Game based self-organizing scheme for femtocell networks. In International conference on game theory for networks (pp. 57–75). Springer.
Han, T., Yin, R., Xu, Y., & Yu, G. (2012). Uplink channel reusing selection optimization for device-to-device communication underlaying cellular networks. In 2012 IEEE 23rd international symposium on personal, indoor and mobile radio communications (PIMRC) (pp. 559–564). https://doi.org/10.1109/PIMRC.2012.6362848.
He, Y., Wang, F., & Wu, J. (2014). Resource management for device-to-device communications in heterogeneous networks using Stackelberg game. International Journal of Antennas and Propagation. https://doi.org/10.1155/2014/395731.
Maghsudi, S., & Stanczak, S. (2015). Channel selection for network-assisted D2D communication via no-regret bandit learning with calibrated forecasting. IEEE Transactions on Wireless Communications, 14(3), 1309–1322. https://doi.org/10.1109/TWC.2014.2365803.
Maghsudi, S., & Stanczak, S. (2016). Hybrid centralized-distributed resource allocation for device-to-device communication underlaying cellular networks. IEEE Transactions on Vehicular Technology, 65(4), 2481–2495. https://doi.org/10.1109/TVT.2015.2423691.
Marden, J. R., Arslan, G., & Shamma, J. S. (2009). Joint strategy fictitious play with inertia for potential games. IEEE Transactions on Automatic Control, 54(2), 208–220.
Monderer, D., & Shapley, L. S. (1996a). Fictitious play property for games with identical interests. Journal of Economic Theory, 68(1), 258–265.
Monderer, D., & Shapley, L. S. (1996b). Potential games. Games and Economic Behavior, 14(1), 124–143.
Peng, B., Hu, C., Peng, T., & Wang, W. (2012). Optimal resource allocation for multi-D2D links underlying OFDMA-based communications. In 2012 8th international conference on wireless communications, networking and mobile computing (pp. 1–4). https://doi.org/10.1109/WiCOM.2012.6478598.
Phunchongharn, P., Hossain, E., & Kim, D. I. (2013). Resource allocation for device-to-device communications underlaying LTE-advanced networks. IEEE Wireless Communications, 20(4), 91–100. https://doi.org/10.1109/MWC.2013.6590055.
Sastry, P. S., Phansalkar, V. V., & Thathachar, M. A. L. (1994). Decentralized learning of Nash equilibria in multi-person stochastic games with incomplete information. IEEE Transactions on Systems, Man, and Cybernetics, 24(5), 769–777. https://doi.org/10.1109/21.293490.
Song, L., Niyato, D., Han, Z., & Hossain, E. (2014). Game-theoretic resource allocation methods for device-to-device communication. IEEE Wireless Communications, 21(3), 136–144. https://doi.org/10.1109/MWC.2014.6845058.
Wu, Q., Xu, Y., Wang, J., Shen, L., Zheng, J., & Anpalagan, A. (2013). Distributed channel selection in time-varying radio environment: Interference mitigation game with uncoupled stochastic learning. IEEE Transactions on Vehicular Technology, 62(9), 4524–4538. https://doi.org/10.1109/TVT.2013.2269152.
Xu, C., Song, L., Han, Z., Zhao, Q., Wang, X., Cheng, X., et al. (2013). Efficiency resource allocation for device-to-device underlay communication systems: A reverse iterative combinatorial auction based approach. IEEE Journal on Selected Areas in Communications, 31(9), 348–358. https://doi.org/10.1109/JSAC.2013.SUP.0513031.
Xu, Y., Wang, J., Wu, Q., Anpalagan, A., & Yao, Y. D. (2012). Opportunistic spectrum access in unknown dynamic environment: A game-theoretic stochastic learning solution. IEEE Transactions on Wireless Communications, 11(4), 1380–1391. https://doi.org/10.1109/TWC.2012.020812.110025.
Yu, C. H., Doppler, K., Ribeiro, C. B., & Tirkkonen, O. (2011). Resource sharing optimization for device-to-device communication underlaying cellular networks. IEEE Transactions on Wireless Communications, 10(8), 2752–2763. https://doi.org/10.1109/TWC.2011.060811.102120.
Zhang, Y., Xu, Y., Gao, M., Zhang, Q., Li, H., Ahmad, I., & Feng, Z. (2015). Resource management in device-to-device underlaying cellular network. In 2015 IEEE wireless communications and networking conference (WCNC) (pp. 1631–1636). https://doi.org/10.1109/WCNC.2015.7127712.
Zhu, X., Wen, S., Cao, G., Zhang, X., & Yang, D. (2012). QoS-based resource allocation scheme for device-to-device (D2D) radio underlaying cellular networks. In 2012 19th international conference on telecommunications (ICT) (pp. 1–6). https://doi.org/10.1109/ICTEL.2012.6221213.
Zulhasnine, M., Huang, C., & Srinivasan, A. (2010). Efficient resource allocation for device-to-device communication underlaying LTE network. In 2010 IEEE 6th international conference on wireless and mobile computing, networking and communications (pp. 368–375). https://doi.org/10.1109/WIMOB.2010.5645039.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendices
Appendix A: Proof of Theorem 1
We choose the potential function as
where \(U(a_{i},{\varvec{a}}_{-i})\) is given by (8) and \(a_{i}\) is one of the L possible transmit configurations. Let \(\pi (s_{i})\) be the set of all D2D pairs other than i that choose the same RB \(s_{i}\). Hence, we have
For any D2D pair i playing action \(a_{i}\) at iteration t, there are three possible choices for the action \(\bar{a_{i}}\) played in iteration \(t+1\).
Case-(i) D2D pair i changes channel from \(s_{i}\) in iteration t to \(\bar{s_{i}}\) in iteration \(t+1\) but transmit power is constant at \(p_{i}\), i.e., \(a_{i}=[s_{i},p_{i}]\) and \(\bar{a_{i}}=[\bar{s_{i}},p_{i}]\).
The change in the potential function when D2D pair i switches from RB \(s_{i}\) to \(\bar{s_{i}}\) equals \(-\)(change in utility of i \(+\) change in utility of the D2D pairs on RB \(s_i\) \(+\) change in utility of the D2D pairs on RB \(\bar{s_i}\) \(+\) change in utility of the D2D pairs \(k\in {{\mathscr {M}}}\setminus (\{i\}~\cup ~\pi (s_{i})~\cup ~\pi (\bar{s_i}))\)), where every change is taken for the switch of i from \(s_{i}\) to \(\bar{s_{i}}\).
The fourth term in the above summation is zero since D2D pairs k are not affected by the strategy change of D2D pair i. Hence, we have
which reduces to
since \({\bar{h}}_{ji}^{\bar{s_{i}}}={\bar{h}}_{ji}^{s_{i}}={\bar{h}}_{ij}^{s_{i}}={\bar{h}}_{ij}^{\bar{s_{i}}}\).
Case-(ii) D2D pair i changes transmit power from \(p_{i}\) in iteration t to \(\bar{p_{i}}\) in iteration \(t+1\) but chooses the same RB \(s_i\), i.e.,
\(a_{i}=[s_{i},p_{i}]\) and \(\bar{a_{i}}=[s_{i},\bar{p_{i}}]\).
The change in the potential function when D2D pair i switches from transmit power \(p_i\) to \(\bar{p_i}\) on RB \(s_i\) equals \(-\)(change in utility of i \(+\) change in utility of the D2D pairs on RB \(s_i\) \(+\) change in utility of the D2D pairs \(k\in {{\mathscr {M}}}\setminus (\{i\}~\cup ~\pi (s_{i}))\)), where every change is taken for the switch of i from \(p_{i}\) to \(\bar{p_{i}}\).
The third term in the above summation is zero since D2D pairs k are not affected by the strategy change of D2D pair i. Hence, we have
Using the same reasoning as in Case-(i), we have
Case-(iii) D2D pair i changes RB from \(s_{i}\) in iteration t to \(\bar{s_{i}}\) in iteration \(t+1\) and transmit power from \(p_{i}\) to \(\bar{p_{i}}\), i.e., \(a_{i}=[s_{i},p_{i}]\) and \(\bar{a_{i}}=[\bar{s_{i}},\bar{p_{i}}]\).
The change in the potential function when D2D pair i switches from RB \(s_{i}\) to \(\bar{s_{i}}\) and from transmit power \(p_i\) to \(\bar{p_i}\) equals \(-\)(change in utility of i \(+\) change in utility of the D2D pairs on RB \(s_i\) \(+\) change in utility of the D2D pairs on RB \(\bar{s_i}\) \(+\) change in utility of the D2D pairs \(k\in {{\mathscr {M}}}\setminus (\{i\}~\cup ~\pi (s_{i})~\cup ~\pi (\bar{s_i}))\)), where every change is taken for the joint switch of i from \([s_{i},p_{i}]\) to \([\bar{s_{i}},\bar{p_{i}}]\).
The fourth term in the above summation is zero since D2D pairs k are not affected by the strategy change of D2D pair i. Hence, we have
which reduces to
From (32), (35) and (38), we have
Hence, we have proved that the interference mitigation game is an ordinal potential game with the potential function given by (27). \(\square \)
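As an illustrative aside, the case analysis above can be checked numerically on a small toy game. The sketch below does not use the paper's utility (8) or potential (27), which are not reproduced in this text. It assumes a hypothetical utility equal to the negative weighted interference \(p_i p_j h_{ij}\) exchanged with co-channel pairs, symmetric mean gains that do not depend on the RB (in line with the gain equality used in Case-(i)), and an assumed potential equal to half the sum of the utilities. All function and parameter names are hypothetical. Under these assumptions the game is an exact potential game, which is a special case of an ordinal potential game.

```python
import itertools
import random


def utility(i, profile, gain):
    """Hypothetical utility of pair i: the negative weighted interference
    exchanged with the other pairs on the same RB (an assumption, not (8))."""
    s_i, p_i = profile[i]
    return -sum(p_i * p_j * gain[i][j]
                for j, (s_j, p_j) in enumerate(profile)
                if j != i and s_j == s_i)


def potential(profile, gain):
    """Assumed potential: half the sum of all utilities, so that every
    co-channel link is counted exactly once (an assumption, not (27))."""
    return 0.5 * sum(utility(i, profile, gain) for i in range(len(profile)))


def check_potential_property(num_pairs=3, rbs=(0, 1), powers=(1.0, 2.0), seed=0):
    """Brute-force check over all profiles and all unilateral deviations
    (RB only, power only, or both) that the change in the deviating pair's
    utility equals the change in the potential."""
    rng = random.Random(seed)
    # Symmetric mean gains, independent of the RB.
    gain = [[0.0] * num_pairs for _ in range(num_pairs)]
    for i in range(num_pairs):
        for j in range(i + 1, num_pairs):
            gain[i][j] = gain[j][i] = rng.uniform(0.1, 1.0)

    actions = list(itertools.product(rbs, powers))  # a_i = [s_i, p_i]
    for profile in itertools.product(actions, repeat=num_pairs):
        for i in range(num_pairs):
            for new_action in actions:
                deviated = list(profile)
                deviated[i] = new_action
                du = utility(i, deviated, gain) - utility(i, profile, gain)
                dphi = potential(deviated, gain) - potential(profile, gain)
                if abs(du - dphi) > 1e-9:
                    return False
    return True


if __name__ == "__main__":
    print(check_potential_property())
```

The check enumerates every action profile, so it is only practical for a handful of pairs, RBs and power levels. It is meant to make the three deviation cases concrete, not to replace the proof.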
Appendix B: Proof of Theorem 2
As in [18], we can rewrite the ODE in (22) as
Using (18), (19) and (23), (40) can be written as
We have
since
Applying (41), (42) and (24), we have
where
From (44), (41) and (40), we have
Using (24) we have
Hence, \({\mathbf{v }(t)}\) converges to a stationary point of the ODE in (22). Thus, from Proposition 2, Theorem 2 follows. \(\square \)
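To make the object of this proof more concrete, the following sketch shows a discrete-time update of the kind whose averaged behaviour an ODE such as (22) describes. It assumes that the Stochastic Learning Algorithm uses the linear reward-inaction rule studied in [18]. Equations (18) to (24) are not reproduced in this text, so the step size, the normalized rewards and the function names `sla_update` and `run_sla` are hypothetical. For simplicity the sketch runs a single learner against fixed rewards rather than the full multi-agent game.

```python
import random


def sla_update(probs, chosen, reward, step):
    """Linear reward-inaction step: shift probability mass toward the chosen
    action in proportion to its normalized reward in [0, 1]. With zero reward
    the probability vector is left unchanged (the "inaction" part)."""
    return [p + step * reward * ((1.0 if k == chosen else 0.0) - p)
            for k, p in enumerate(probs)]


def run_sla(rewards, step=0.05, iterations=2000, seed=0):
    """Run one learner against fixed, hypothetical normalized rewards and
    return its final mixed strategy."""
    rng = random.Random(seed)
    n = len(rewards)
    probs = [1.0 / n] * n  # start from the uniform mixed strategy
    for _ in range(iterations):
        # Sample an action from the current mixed strategy.
        chosen = rng.choices(range(n), weights=probs)[0]
        probs = sla_update(probs, chosen, rewards[chosen], step)
    return probs


if __name__ == "__main__":
    final = run_sla([0.2, 0.9, 0.5])
    print([round(p, 3) for p in final])
```

Each update is a convex combination of the current probability vector and a unit vector, so the iterate stays on the probability simplex. For a small step size the trajectory is expected to track the ODE solution, which is the regime the proof addresses.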
Cite this article
Dominic, S., Jacob, L. Learning algorithms for joint resource block and power allocation in underlay D2D networks. Telecommun Syst 69, 285–301 (2018). https://doi.org/10.1007/s11235-018-0438-0
DOI: https://doi.org/10.1007/s11235-018-0438-0