Resource Allocation for Multi-source Multi-relay Wireless Networks: A Multi-Armed Bandit Approach

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCCN, volume 12845)

Abstract

In this paper, we consider the problem of link adaptation (rate allocation) for the Orthogonal Multiple Access Multiple Relay Channel (OMAMRC) using the Multi-Armed Bandit (MAB) online learning framework. The cooperative system is composed of a transmission phase, where sources transmit in a round-robin manner, and a retransmission phase, where a scheduled node sends redundancies. We assume no knowledge of either the Channel State Information (CSI) or the Channel Distribution Information (CDI). Accordingly, the rate allocation must be learned online with a sequential learning algorithm. We adopt one family of MAB algorithms, the Upper Confidence Bound (UCB) family, and specifically the UCB1 algorithm. UCB1 achieves logarithmic regret uniformly over time, without any prior knowledge of the reward distributions. Because the number of arms grows exponentially with the number of sources included in the rate allocation, UCB1 suffers from a complexity problem. We therefore propose a sequential UCB1 (SUCB1) algorithm, which solves the complexity issue and outperforms UCB1.
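As a rough illustration of the UCB1 index policy mentioned above, the following minimal sketch follows the standard description in [24]: play each arm once, then play the arm maximising the empirical mean plus the bonus \(\sqrt{2\ln t/n_j}\). It is not the paper's rate-allocation code. The Bernoulli reward model, the function names `ucb1` and `make_bernoulli_pull`, and the three success probabilities are assumptions made only for this example; in the setting of the abstract, each arm would be a joint rate vector, so the number of arms would grow exponentially with the number of sources.

```python
import math
import random


def ucb1(pull, n_arms, horizon):
    """Run UCB1 for `horizon` rounds; `pull(arm)` must return a reward in [0, 1]."""
    counts = [0] * n_arms    # number of times each arm has been played
    means = [0.0] * n_arms   # empirical mean reward of each arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1      # initialisation: play every arm once
        else:
            # UCB1 index: empirical mean + exploration bonus sqrt(2 ln t / n_j)
            arm = max(
                range(n_arms),
                key=lambda j: means[j] + math.sqrt(2.0 * math.log(t) / counts[j]),
            )
        reward = pull(arm)
        counts[arm] += 1
        # incremental update of the empirical mean
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


def make_bernoulli_pull(success_probs, seed=0):
    """Hypothetical environment: arm j pays 1 with probability success_probs[j]."""
    rng = random.Random(seed)
    return lambda arm: 1.0 if rng.random() < success_probs[arm] else 0.0


counts, means = ucb1(make_bernoulli_pull([0.2, 0.5, 0.8]), n_arms=3, horizon=2000)
```

The per-round cost of this sketch is linear in the number of arms, which is why a joint search over all rate vectors becomes expensive as sources are added.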


References

  1. Kramer, G., Marić, I., Yates, R.D.: Cooperative communications (2007)

  2. Kramer, G., Gastpar, M., Gupta, P.: Cooperative strategies and capacity theorems for relay networks. IEEE Trans. Inf. Theory 51(9), 3037–3063 (2005)

  3. Sankaranarayanan, L., Kramer, G., Mandayam, N.B.: Hierarchical sensor networks: capacity bounds and cooperative strategies using the multiple-access relay channel model. In: 2004 First Annual IEEE Communications Society Conference on Sensor and Ad Hoc Communications and Networks, SECON 2004, pp. 191–199. IEEE (2004)

  4. Laneman, J.N.: Cooperative diversity in wireless networks: algorithms and architectures. Ph.D. dissertation, Massachusetts Institute of Technology, Cambridge, MA (2002)

  5. Lim, S.H., Kim, Y.H., Gamal, A.E., Chung, S.Y.: Noisy network coding. IEEE Trans. Inf. Theory 57(5), 3132–3152 (2011)

  6. Avestimehr, A.S., Diggavi, S.N., Tse, D.N.C.: Wireless network information flow: a deterministic approach. IEEE Trans. Inf. Theory 57(4), 1872–1905 (2011)

  7. Cover, T., Gamal, A.E.: Capacity theorems for the relay channel. IEEE Trans. Inf. Theory 25(5), 572–584 (1979)

  8. Khansa, A.A., Cerovic, S., Visoz, R., Hayel, Y., Lasaulce, S.: Slow-link adaptation algorithm for multi-source multi-relay wireless networks using best-response dynamics. To be presented at NetGCOOP (2021)

  9. Bubeck, S., Cesa-Bianchi, N.: Regret analysis of stochastic and nonstochastic multi-armed bandit problems. arXiv preprint arXiv:1204.5721 (2012)

  10. Lai, T.L., Robbins, H.: Asymptotically efficient adaptive allocation rules. Adv. Appl. Math. 6(1), 4–22 (1985)

  11. Weng, L.: The multi-armed bandit problem and its solutions. lilianweng.github.io/lil-log (2018)

  12. Bubeck, S.: Bandits games and clustering foundations. Ph.D. dissertation, INRIA Nord Europe (2010)

  13. Garivier, A., Cappé, O.: The KL-UCB algorithm for bounded stochastic bandits and beyond. In: Proceedings of the 24th Annual Conference on Learning Theory, JMLR Workshop and Conference Proceedings, pp. 359–376 (2011)

  14. Thompson, W.R.: On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 25(3/4), 285–294 (1933)

  15. Chapelle, O., Li, L.: An empirical evaluation of Thompson sampling. Adv. Neural Inf. Process. Syst. 24, 2249–2257 (2011)

  16. Russo, D., Van Roy, B., Kazerouni, A., Osband, I., Wen, Z.: A tutorial on Thompson sampling. arXiv preprint arXiv:1707.02038 (2017)

  17. Combes, R., Magureanu, S., Proutiere, A.: Minimal exploration in structured stochastic bandits. arXiv preprint arXiv:1711.00400 (2017)

  18. Ameur, W.B., Mary, P., Hélard, J.-F., Dumay, M., Schwoerer, J.: Autonomous power decision for the grant free access MUSA scheme in the mMTC scenario. Sensors 21(1), 116 (2021)

  19. Nasim, I., Ibrahim, A.S., Kim, S.: Learning-based beamforming for multi-user vehicular communications: a combinatorial multi-armed bandit approach. IEEE Access 8, 219891–219902 (2020)

  20. Chen, W., Wang, Y., Yuan, Y.: Combinatorial multi-armed bandit: general framework and applications. In: International Conference on Machine Learning, pp. 151–159. PMLR (2013)

  21. Kuchibhotla, V., Harshitha, P., Elugoti, D.: Combinatorial sleeping bandits with fairness constraints and long-term non-availability of arms. In: 2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA), pp. 1575–1581. IEEE (2020)

  22. Cerovic, S., Visoz, R., Madier, L., Berthet, A.O.: Centralized scheduling strategies for cooperative HARQ retransmissions in multi-source multi-relay wireless networks. In: Proceedings of IEEE ICC 2018, Kansas City, MO, USA (2018)

  23. Mohamad, A., Visoz, R., Berthet, A.O.: Outage analysis of various cooperative strategies for the multiple access multiple relay channel. In: 2013 IEEE 24th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), pp. 1321–1326. IEEE (2013)

  24. Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Mach. Learn. 47(2), 235–256 (2002)

Author information

Corresponding author

Correspondence to Ali Al Khansa.

Appendix

A: Outage Events

Based on Proposition 1 of [23], there is a direct relation between the individual outage and the common outage. The individual outage is the event that an individual source is not decoded correctly at the destination after \(T_{max}\) rounds. Similarly, the common outage is defined for a set of sources and is declared when at least one source within this set is not decoded correctly at the destination. In other words, the common outage of a set occurs when one or more of its source nodes are in individual outage.
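The relation stated above can be sketched in a few lines: the common outage of a set is the logical OR of the individual outages of its members. The dictionary `individual_outage` and the source labels below are hypothetical values chosen for illustration only.

```python
def common_outage(individual_outage, source_set):
    """Common outage of `source_set`: at least one member is in individual outage."""
    return any(individual_outage[s] for s in source_set)


# Hypothetical decoding status after T_max rounds: True means "not decoded".
individual_outage = {"s1": False, "s2": True, "s3": False}

outage_s1_s3 = common_outage(individual_outage, {"s1", "s3"})
outage_s1_s2 = common_outage(individual_outage, {"s1", "s2"})
```

Here the set containing `s2` is in common outage, while the set made of `s1` and `s3` is not.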

Fig. 3. Efficiency for \(\gamma = -4\) dB

Fig. 4. Efficiency for \(\gamma = 6\) dB

Fig. 5. Efficiency for \(\gamma = 21\) dB

Fig. 6. ASE vs \(\gamma \) after 500 samples

Both the individual outage event \(\mathcal {O}_{s,t}(a_t,\mathcal {S}_{a_t,t-1} |\mathcal {P}_{t-1})\) of a source s after round t and the common outage event \(\mathcal {E}_t(a_t,\mathcal {S}_{a_t,t-1} |\mathcal {P}_{t-1})\) after round t depend directly on the allocated rates. They also depend on the selected node \(a_t\in \mathcal {N}\) and its associated decoding set \(\mathcal {S}_{a_t,t-1}\). Both events are conditional on the knowledge of \(\mathcal {P}_{t-1}\), the set collecting the nodes \(\widehat{a}_l\) selected in rounds \(l \in \{ 1,\dots ,t-1 \}\) prior to round t, together with their associated decoding sets \(\mathcal {S}_{\widehat{a}_l,l-1}\) and the decoding set of the destination \(\mathcal {S}_{d,t-1}\) (\(\mathcal {S}_{d,0}\) is the destination’s decoding set after the first phase).

Analytically, the common outage event of a given subset of sources is declared if the vector of their rates lies outside the corresponding MAC capacity region. For a subset of sources \(\mathcal {B}\subseteq \overline{\mathcal {S}}_{d,t-1}\), where \(\overline{\mathcal {S}}_{d,t-1}=\mathcal {S} \setminus {\mathcal {S}}_{d,t-1}\) is the set of sources not yet successfully decoded at the destination after round \(t-1\), and for a candidate node \(a_t\), this event can be expressed as:

$$\begin{aligned}&\mathcal {E}_{t,\mathcal {B}}(a_t,\mathcal {S}_{a_t,t-1})\\&=\bigcup _{\mathcal {U}\subseteq \mathcal {B}} \Big \{ \sum _{i \in \mathcal {U}}R_i > \sum _{i \in \mathcal {U}} I_{i,d} + \alpha \sum _{l=1}^{t-1} I_{\widehat{a}_l,d} \mathcal {C}_{\widehat{a}_l}(\mathcal {U}) + \alpha I_{a_t,d} \mathcal {C}_{a_t}(\mathcal {U}) \Big \}, \end{aligned} \tag{4}$$

where \(I_{a,b}\) denotes the mutual information between the nodes a and b (the mutual information depends on the channel inputs; see Sect. 6 for an example with Gaussian inputs), and where \(\mathcal {C}_{\widehat{a}_l}\) and \(\mathcal {C}_{a_t}\) are indicator terms, equal to 1 when the bracketed condition holds and to 0 otherwise, defined with \(\mathcal {I}=\overline{\mathcal {S}}_{d,t-1}\setminus \mathcal {B}\) as:

$$\begin{aligned}&\mathcal {C}_{\widehat{a}_l}(\mathcal {U})=\Big [ (\mathcal {S}_{\widehat{a}_l,l-1}\cap \mathcal {U} \ne \emptyset )\wedge (\mathcal {S}_{\widehat{a}_l,l-1}\cap \mathcal {I}=\emptyset ) \Big ],\\&\mathcal {C}_{a_t}(\mathcal {U})=\Big [ (\mathcal {S}_{a_t,t-1}\cap \mathcal {U} \ne \emptyset )\wedge (\mathcal {S}_{a_t,t-1}\cap \mathcal {I}=\emptyset ) \Big ]. \end{aligned} \tag{5}$$

The individual outage event of a source s after round t for a candidate node \(a_t\) can be defined as:

$$\begin{aligned}&\mathcal {O}_{s,t}(a_t,\mathcal {S}_{a_t,t-1})=\bigcap _{\mathcal {I}\subset \overline{\mathcal {S}}_{d,t-1},\mathcal {B}= \overline{\mathcal {I}},s\in \mathcal {B}}\mathcal {E}_{t,\mathcal {B}}(a_t,\mathcal {S}_{a_t,t-1}) \\&=\bigcap _{\mathcal {I}\subset \overline{\mathcal {S}}_{d,t-1}} \bigcup _{\mathcal {U}\subseteq \overline{\mathcal {I}}:s\in \mathcal {U}} \Big \{ \sum _{i \in \mathcal {U}}R_i > \sum _{i \in \mathcal {U}} I_{i,d} + \alpha \sum _{l=1}^{t-1} I_{\widehat{a}_l,d} \mathcal {C}_{\widehat{a}_l}(\mathcal {U}) + \alpha I_{a_t,d} \mathcal {C}_{a_t}(\mathcal {U}) \Big \}, \end{aligned} \tag{6}$$

where \(\overline{\mathcal {I}}=\overline{\mathcal {S}}_{d,t-1}\setminus \mathcal {I}\).
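To make the set operations in Eqs. (4) to (6) concrete, here is a minimal brute-force sketch that enumerates the subsets \(\mathcal {U}\) and \(\mathcal {I}\) directly. Several points are assumptions made for illustration and are not taken from the paper: the function and argument names are hypothetical, the previously selected nodes and the candidate node \(a_t\) are merged into a single list `retx` of (mutual information, decoding set) pairs, the individual outage follows the second line of Eq. (6) (only subsets \(\mathcal {U}\) containing s), and the numerical values are arbitrary.

```python
from itertools import combinations


def nonempty_subsets(items):
    """Yield every non-empty subset of `items` as a frozenset."""
    items = sorted(items)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def helps(decoding_set, U, I):
    """Bracket of Eq. (5): the node decoded a source in U and none in I."""
    return bool(decoding_set & U) and not (decoding_set & I)


def violated(U, I, rates, mi_direct, alpha, retx):
    """Inequality inside the braces of Eqs. (4) and (6) for one subset U."""
    budget = sum(mi_direct[i] for i in U)
    budget += alpha * sum(mi for mi, dec in retx if helps(dec, U, I))
    return sum(rates[i] for i in U) > budget


def common_outage(B, undecoded, rates, mi_direct, alpha, retx):
    """Eq. (4): some subset U of B has a sum rate above its budget."""
    I = undecoded - B
    return any(violated(U, I, rates, mi_direct, alpha, retx)
               for U in nonempty_subsets(B))


def individual_outage(s, undecoded, rates, mi_direct, alpha, retx):
    """Second line of Eq. (6): for every I not containing s, some U with s fails."""
    others = undecoded - {s}
    for I in [frozenset()] + list(nonempty_subsets(others)):
        B = undecoded - I
        if not any(violated(U, I, rates, mi_direct, alpha, retx)
                   for U in nonempty_subsets(B) if s in U):
            return False
    return True


# Arbitrary illustrative values: two undecoded sources, one helping node.
undecoded = frozenset({"s1", "s2"})
rates = {"s1": 1.0, "s2": 1.0}
mi_direct = {"s1": 1.5, "s2": 0.25}       # I_{i,d} of the direct links
retx = [(1.0, frozenset({"s2"}))]         # (I_{a,d}, decoding set of node a)
alpha = 0.5

out_common = common_outage(undecoded, undecoded, rates, mi_direct, alpha, retx)
out_s1 = individual_outage("s1", undecoded, rates, mi_direct, alpha, retx)
out_s2 = individual_outage("s2", undecoded, rates, mi_direct, alpha, retx)
```

With these values, source `s2` is in individual outage (its rate of 1.0 exceeds its budget of 0.75 for every admissible \(\mathcal {I}\)), source `s1` is not, and the set of both sources is in common outage. The enumeration is exponential in the number of undecoded sources, so it is only practical for small networks.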

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Khansa, A.A., Visoz, R., Hayel, Y., Lasaulce, S. (2021). Resource Allocation for Multi-source Multi-relay Wireless Networks: A Multi-Armed Bandit Approach. In: Elbiaze, H., Sabir, E., Falcone, F., Sadik, M., Lasaulce, S., Ben Othman, J. (eds) Ubiquitous Networking. UNet 2021. Lecture Notes in Computer Science, vol 12845. Springer, Cham. https://doi.org/10.1007/978-3-030-86356-2_6

  • DOI: https://doi.org/10.1007/978-3-030-86356-2_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86355-5

  • Online ISBN: 978-3-030-86356-2

  • eBook Packages: Computer Science, Computer Science (R0)
