Abstract
Cooperative frequency reuse among base stations (BSs) can improve system spectral efficiency by reducing intercell interference through channel assignment and precoding. This paper presents a game-theoretic study of channel assignment for realizing network multiple-input multiple-output (MIMO) operation under time-varying wireless channels. We propose a new joint precoding scheme with enhanced interference mitigation and capacity improvement capabilities for network MIMO systems. We formulate the channel assignment problem from a game-theoretic perspective with BSs as the players, and show that the game is an exact potential game under the proposed utility function. We then propose a distributed, stochastic learning-based algorithm in which each BS progressively moves toward the Nash equilibrium (NE) strategy based only on its own action-reward history. The convergence of the proposed learning algorithm toward an NE point is verified theoretically and numerically for different network topologies. Extensive link-level simulations also show that the proposed learning algorithm improves capacity and fairness compared with other schemes.









Notes
Note that with the approximated utility function the existence of a pure strategy NE is no longer guaranteed theoretically. However, as will be shown in Sect. 6, convergence to NE is observed numerically.
References
Zhang, H., Dai, H., & Zhou, Q. (2004). Base station cooperation for multiuser MIMO: Joint transmission and BS selection. In Proceedings of IEEE CISS ’04.
Chang, R. Y., Tao, Z., Zhang, J., & Kuo, C.-C. J. (2009). Multicell OFDMA downlink resource allocation using a graphic framework. IEEE Transactions on Vehicular Technology, 58(7), 3494–3507.
Hadisusanto, Y., Thiele, L., & Jungnickel, V. (2008). Distributed base station cooperation via block-diagonalization and dual-decomposition. In Proceedings of IEEE GLOBECOM ’08, pp. 1–5.
Spencer, Q. H., Swindlehurst, A. L., & Haardt, M. (2004). Zero-forcing methods for downlink spatial multiplexing in multiuser MIMO channels. IEEE Transactions on Signal Processing, 52(2), 461–471.
Caire, G., Ramprashad, S. A., & Papadopoulos, H. C. (2010). Rethinking network MIMO: Cost of CSIT, performance analysis, and architecture comparisons. In Proceedings of ITA ’10, pp. 1–10.
Zhang, J., Chen, R., Andrews, J. G., Ghosh, A., & Heath, R. W. (2009). Networked MIMO with clustered linear precoding. IEEE Transactions on Wireless Communications, 8(4), 1910–1921.
Kaviani, S., Simeone, O., Krzymien, W. A., & Shamai, S. (2011, December). Linear MMSE precoding and equalization for network MIMO with partial cooperation. In Proceedings of IEEE GLOBECOM ’11, pp. 1–6.
de Kerret, P., & Gesbert, D. (2012, April). Sparse precoding in multicell MIMO systems. In Proceedings of IEEE WCNC ’12, pp. 958–962.
de Kerret, P., & Gesbert, D. (2011, August). The multiplexing gain of a two-cell MIMO channel with unequal CSI. In Proceedings of IEEE ISIT ’11, pp. 558–562.
Zakhour, R., Ho, Z., & Gesbert, D. (2009, April). Distributed beamforming coordination in multicell MIMO channels. In Proceedings of IEEE VTC Spring ’09, pp. 1–5.
Zakhour, R., & Gesbert, D. (2010). Distributed multicell-MISO precoding using the layered virtual SINR framework. IEEE Transactions on Wireless Communications, 9(8), 2444–2448.
Bjornson, E., Jalden, N., Bengtsson, M., & Ottersten, B. (2011). Optimality properties, distributed strategies, and measurement-based evaluation of coordinated multicell OFDMA transmission. IEEE Transactions on Signal Processing, 59(12), 6086–6101.
Sundaresan, K., & Rangarajan, S. (2009). Efficient resource management in OFDMA femtocells. In Proceedings of ACM MobiHoc ’09, pp. 33–42.
Lopez-Perez, D., Valcarce, A., de la Roche, G., & Zhang, J. (2009). OFDMA femtocells: A roadmap on interference avoidance. IEEE Communications Magazine, 47(9), 41–48.
Hatoum, A., Aitsaadi, N., Langar, R., Boutaba, R., & Pujolle, G. (2011, June). FCRA: Femtocell cluster-based resource allocation scheme for OFDMA networks. In Proceedings of IEEE ICC ’11, pp. 1–6.
Bloem, M., Alpcan, T., & Başar, T. (2007). A Stackelberg game for power control and channel allocation in cognitive radio networks. In Proceedings of ICST VALUETOOLS ’07, p. 4.
Li, H. (2010). Multiagent Q-learning for Aloha-like spectrum access in cognitive radio systems. EURASIP Journal on Wireless Communications and Networking, 2010. http://jwcn.eurasipjournals.com/content/2010/1/876216/.
Galindo-Serrano, A., & Giupponi, L. (2011, October). Femtocell systems with self organization capabilities. In Proceedings of IEEE NetGCoop ’11, pp. 1–7.
Nie, N., & Comaniciu, C. (2005, November). Adaptive channel allocation spectrum etiquette for cognitive radio networks. In Proceedings of IEEE DySPAN ’05, pp. 269–278.
Xu, Y., Wang, J., Anpalagan, A., & Yao, Y.-D. (2012). Opportunistic spectrum access in unknown dynamic environment: A game-theoretic stochastic learning solution. IEEE Transactions on Wireless Communications, 11(4), 1380–1391.
Zhong, W., & Xu, Y. (2010). Game theoretic multimode precoding strategy selection for MIMO multiple access channels. IEEE Signal Processing Letters, 17(6), 563–566.
Sastry, P. S., Phansalkar, V. V., & Thathachar, M. A. L. (1994). Decentralized learning of Nash equilibria in multi-person stochastic games with incomplete information. IEEE Transactions on Systems, Man and Cybernetics, 24(5), 769–777.
Tembine, H. (2011). Dynamic robust games in MIMO systems. IEEE Transactions on Systems, Man and Cybernetics B, 41(4), 990–1002.
Busoniu, L., Babuska, R., & De Schutter, B. (2008). A comprehensive survey of multiagent reinforcement learning. IEEE Transactions on Systems, Man and Cybernetics C, 38(2), 156–172.
Cominetti, R., Melo, E., & Sorin, S. (2010). A payoff-based learning procedure and its application to traffic games. Games and Economic Behavior, 70(1), 71–83.
Bravo, M. (2011). An adjusted payoff-based procedure for normal form games. arXiv preprint arXiv:1106.5596.
Khan, M. A., Tembine, H., & Vasilakos, A. V. (2012). Game dynamics and cost of learning in heterogeneous 4G networks. IEEE Journal on Selected Areas in Communications, 30(1), 198–213.
Goldsmith, A. J., & Chua, S.-G. (1997). Variable-rate variable-power MQAM for fading channels. IEEE Transactions on Communications, 45(10), 1218–1230.
Sadek, M., Tarighat, A., & Sayed, A. H. (2007). A leakage-based precoding scheme for downlink multi-user MIMO channels. IEEE Transactions on Wireless Communications, 6(5), 1711–1721.
Monderer, D., & Shapley, L. S. (1996). Potential games. Games and Economic Behavior, 14, 124–143.
Brown, G. W. (1951). Iterative solution of games by fictitious play. Activity Analysis of Production and Allocation, 13(1), 374–376.
Borkar, V. S. (1997). Stochastic approximation with two time scales. Systems & Control Letters, 29(5), 291–294.
Leslie, D. S., & Collins, E. J. (2003). Convergent multiple-timescales reinforcement learning algorithms in normal form games. The Annals of Applied Probability, 13(4), 1231–1251.
Leslie, D. S., & Collins, E. J. (2005). Individual Q-learning in normal form games. SIAM Journal on Control and Optimization, 44(2), 495–514.
Nemirovski, A. S., Juditsky, A., Lan, G., & Shapiro, A. (2009). Robust stochastic approximation approach to stochastic programming. SIAM Journal on Optimization, 19(4), 1574–1609.
Coucheney, P., Gaujal, B., & Mertikopoulos, P. (2014). Penalty-regulated dynamics and robust learning procedures in games. arXiv preprint arXiv:1303.2270.
Leslie, D. S., & Collins, E. J. (2006). Generalised weakened fictitious play. Games and Economic Behavior, 56, 285–298.
Fudenberg, D., & Levine, D. K. (1998). The theory of learning in games (Vol. 2). Cambridge: MIT Press.
Bournez, O., & Cohen, J. (2013). Learning equilibria in games by stochastic distributed algorithms. In E. Gelenbe & R. Lent (Eds.), Computer and information sciences III, (pp. 31–38). London: Springer.
Tembine, H. (2012). Distributed strategic learning for wireless engineers. Boca Raton: CRC Press.
Billingsley, P. (1995). Probability and measure. Hoboken: Wiley-Interscience.
Kushner, H. J., & Yin, G. G. (2003). Stochastic approximation and recursive algorithms and applications. Berlin: Springer.
Benaïm, M. (1999). Dynamics of stochastic approximation algorithms. Séminaire de Probabilités XXXIII, 1709, 1–68.
Benveniste, A., Metivier, M., & Priouret, P. (1987). Adaptive algorithms and stochastic approximations. Berlin: Springer.
Benaïm, M., & Hirsch, M. W. (1999). Stochastic approximation algorithms with constant step size whose average is cooperative. The Annals of Applied Probability, 9(1), 216–241.
3GPP. (2011). Spatial channel model for multiple input multiple output (MIMO) simulations (Release 10). 3GPP Technical Report TR 25.996, v10.0.0, March 2011.
Jain, R., Chiu, D., & Hawe, W. (1984). A quantitative measure of fairness and discrimination for resource allocation in shared computer systems. DEC Research Report TR-301.
Proof of proposition 4
The proposition is proved by showing that \(\varPsi\) is nondecreasing and upper-bounded along the trajectory of the ODE in (27). First, we rewrite the ODE in (27) as follows:
Given that \(\varPsi (\mathbf {e}_{s_i}, \mathbf {P}_{-i}) = {\partial \varPsi (\mathbf {P})}/{\partial p_{i,s_i}}\) is positively correlated with \(\psi _i(\mathbf {e}_{s_i}, \mathbf {P}_{-i})\), let \(D_{i,s_i,s'_i} = \psi _i(\mathbf {e}_{s_i}, \mathbf {P}_{-i})-\psi _i(\mathbf {e}_{s'_i}, \mathbf {P}_{-i})\) and \(E_{i,s_i,s'_i} = \varPsi (\mathbf {e}_{s_i}, \mathbf {P}_{-i})-\varPsi (\mathbf {e}_{s'_i}, \mathbf {P}_{-i})\). We may then write
By applying (35) and (36), the derivative of \(\varPsi (\mathbf {P})\) with respect to \(t\) is given by
where the last inequality holds since given the condition in (36), \(D_{i,s_i,s'_i}\) and \(E_{i,s_i,s'_i}\) always have the same sign.
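The monotonicity step can be sketched as follows. This is a sketch under a stated assumption, not a quotation of the paper's numbered equations: it assumes the rewritten ODE has the replicator-type form common in the learning-automata literature [22], and it uses only the definitions of \(D_{i,s_i,s'_i}\) and \(E_{i,s_i,s'_i}\) given above, together with the antisymmetry \(D_{i,s_i,s'_i} = -D_{i,s'_i,s_i}\).

```latex
% Assumed replicator-type form of the ODE (an assumption, not taken from (27)):
\[
\frac{dp_{i,s_i}}{dt} = p_{i,s_i} \sum_{s'_i} p_{i,s'_i}\, D_{i,s_i,s'_i}
\]
% Chain rule with \partial\varPsi/\partial p_{i,s_i} = \varPsi(\mathbf{e}_{s_i}, \mathbf{P}_{-i}),
% followed by symmetrization over the pair (s_i, s'_i):
\[
\begin{aligned}
\frac{d\varPsi(\mathbf{P})}{dt}
 &= \sum_i \sum_{s_i} \varPsi(\mathbf{e}_{s_i}, \mathbf{P}_{-i})\, \frac{dp_{i,s_i}}{dt} \\
 &= \sum_i \sum_{s_i} \sum_{s'_i} p_{i,s_i}\, p_{i,s'_i}\,
    \varPsi(\mathbf{e}_{s_i}, \mathbf{P}_{-i})\, D_{i,s_i,s'_i} \\
 &= \frac{1}{2} \sum_i \sum_{s_i} \sum_{s'_i} p_{i,s_i}\, p_{i,s'_i}\,
    D_{i,s_i,s'_i}\, E_{i,s_i,s'_i} \;\ge\; 0
\end{aligned}
\]
```

The last equality averages the triple sum with its copy obtained by swapping \(s_i\) and \(s'_i\). The final inequality is exactly the same-sign property of \(D_{i,s_i,s'_i}\) and \(E_{i,s_i,s'_i}\), since every weight \(p_{i,s_i} p_{i,s'_i}\) is nonnegative.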
Thus, \(\varPsi\) is nondecreasing along the trajectories of the ODE, and asymptotically all the trajectories will be in the set \(\{\mathbf {P} \in \mathcal {P}: \frac{d\varPsi (\mathbf {P})}{dt} = 0\}\). From (35) and (37), we know
According to Proposition 3, when starting from an interior point of the simplex of the mixed strategy space \(\mathcal {P}\), the trajectory of the ODE in (35) converges to a stable stationary point, i.e., an NE. Then, by Proposition 2, the linearly interpolated process of the strategy update \(p_{i,s_i}(n)\) remains within a neighborhood of the trajectory of (35). This completes the proof [22, Theorem 3.3].
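The nondecreasing behavior of \(\varPsi\) can also be checked numerically on a toy example. The sketch below is illustrative only and rests on several assumptions that are not taken from the paper: a two-player identical-interest game (the simplest exact potential game, where each utility equals the potential), a hypothetical random 3×3 payoff matrix `A`, a replicator-type ODE, and a forward-Euler discretization with step `dt`. It does not implement the paper's utility function or its learning algorithm.

```python
import random


def potential(A, p, q):
    """Expected potential Psi(P) = sum over (s, t) of p[s] * A[s][t] * q[t]."""
    return sum(p[s] * A[s][t] * q[t]
               for s in range(len(p)) for t in range(len(q)))


def replicator_step(A, p, q, dt):
    """One forward-Euler step of the assumed replicator-type ODE for both players."""
    n, m = len(p), len(q)
    # Psi(e_s, q): potential when player 1 plays pure action s against mixed q.
    u1 = [sum(A[s][t] * q[t] for t in range(m)) for s in range(n)]
    # Psi(p, e_t): potential when player 2 plays pure action t against mixed p.
    u2 = [sum(p[s] * A[s][t] for s in range(n)) for t in range(m)]
    avg = potential(A, p, q)
    # Actions that beat the current average gain probability mass.
    p_new = [p[s] + dt * p[s] * (u1[s] - avg) for s in range(n)]
    q_new = [q[t] + dt * q[t] * (u2[t] - avg) for t in range(m)]
    return p_new, q_new


def simulate(A, p, q, dt=0.05, steps=4000):
    """Return the final strategies and the potential along the trajectory."""
    values = [potential(A, p, q)]
    for _ in range(steps):
        p, q = replicator_step(A, p, q, dt)
        values.append(potential(A, p, q))
    return p, q, values


random.seed(0)
# Hypothetical payoff matrix with entries in [0, 1); not from the paper.
A = [[random.random() for _ in range(3)] for _ in range(3)]
p0 = [1 / 3] * 3  # interior starting point of the simplex
q0 = [1 / 3] * 3
p_end, q_end, values = simulate(A, p0, q0)

# The potential should never decrease (up to floating-point noise).
monotone = all(b >= a - 1e-12 for a, b in zip(values, values[1:]))
print("potential nondecreasing:", monotone)
```

The step size is kept small relative to the payoff range so that the Euler discretization inherits the monotonicity of the continuous-time dynamics; a large `dt` could break it.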
Cite this article
Tseng, LC., Chien, FT., Chang, R.Y. et al. Distributed channel assignment for network MIMO: game-theoretic formulation and stochastic learning. Wireless Netw 21, 1211–1226 (2015). https://doi.org/10.1007/s11276-014-0844-5
DOI: https://doi.org/10.1007/s11276-014-0844-5