QoS-Based Blind Spectrum Selection with Multi-armed Bandit Problem in Cognitive Radio Networks

  • Published in: Wireless Personal Communications

Abstract

In the framework of cognitive radio, joint spectrum sensing and access strategies have been studied extensively in recent years. In practice, the sensing ability of a cognitive radio is limited and the channel statistics may not be known a priori. In this paper, we investigate blind spectrum selection with the multi-armed bandit model, considering both primary-user activity and channel quality to meet diverse QoS requirements, e.g., a high transmission success rate for real-time applications and high throughput for best-effort applications. First, we propose a policy, kth-UCB1, which is based on the UCB1 policy for the multi-armed bandit problem but converges to the kth-best arm. We then design a distributed, order-optimal policy that lets multiple users access the rank-best channels according to their QoS requirements. The expected regret of the proposed policy is proven to be logarithmic in the number of time slots, and simulation results show that it achieves better performance.
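
This preview does not include the algorithmic details of kth-UCB1, but the abstract describes it as a UCB1 variant that converges to the kth-best arm rather than the best one. The sketch below is a minimal, hypothetical single-user illustration of that idea under the following assumptions: Bernoulli channel rewards, the standard UCB1 exploration bonus sqrt(2 ln t / n), and a selection rule that plays the arm whose index ranks kth (k = 1 recovers ordinary UCB1), in the spirit of the kth-largest-index selection of Gai and Krishnamachari (reference 20 below). The function names kth_ucb1_select and run_kth_ucb1 and all parameter choices are illustrative, not the authors' actual policy.

import math
import random

def kth_ucb1_select(counts, sums, t, k):
    """Return the arm whose UCB1 index ranks k-th largest (k = 1 is plain UCB1)."""
    # Initialization: play every arm once before trusting the indices.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Standard UCB1 index: empirical mean plus exploration bonus.
    scores = [sums[i] / counts[i] + math.sqrt(2.0 * math.log(t) / counts[i])
              for i in range(len(counts))]
    # Rank arms by index and pick the k-th best instead of the best.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[k - 1]

def run_kth_ucb1(channel_means, k, horizon, seed=0):
    """Simulate Bernoulli channels; report how often the k-th best channel is chosen."""
    rng = random.Random(seed)
    n_arms = len(channel_means)
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    target = sorted(range(n_arms), key=lambda i: channel_means[i], reverse=True)[k - 1]
    hits = 0
    for t in range(1, horizon + 1):
        arm = kth_ucb1_select(counts, sums, t, k)
        reward = 1.0 if rng.random() < channel_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        hits += int(arm == target)
    return hits / horizon

if __name__ == "__main__":
    # Three channels with idle probabilities 0.9, 0.6, 0.3; a user targeting the 2nd-best channel.
    print(run_kth_ucb1([0.9, 0.6, 0.3], k=2, horizon=20000))

Under a logarithmic-regret guarantee of the kind stated in the abstract, such a rule should spend an increasing fraction of slots on the kth-best channel as the horizon grows, with only logarithmically many slots spent elsewhere.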


References

  1. Zhao, Q., & Sadler, B. M. (2007). A survey of dynamic spectrum access. IEEE Signal Processing Magazine, 24(3), 79–89.

  2. Huang, J., Xing, G., Zhou, G., & Zhou, R. (2010). Beyond co-existence: Exploiting WiFi white space for Zigbee performance assurance. In 2010 18th IEEE International Conference on Network Protocols (ICNP) (pp. 305–314).

  3. Anandkumar, A., Michael, N., & Tang, A. (2010). Opportunistic spectrum access with multiple users: Learning under competition. In INFOCOM, 2010 Proceedings IEEE (pp. 1–9).

  4. Liu, K., & Zhao, Q. (2010). Decentralized multi-armed bandit with multiple distributed players. In Information theory and applications workshop (ITA), 2010 (pp. 1–10).

  5. Lee, W.-Y., & Akyildiz, I. F. (2011). A spectrum decision framework for cognitive radio networks. IEEE Transactions on Mobile Computing, 10(2), 161–174.

  6. Lai, T. L., & Robbins, H. (1985). Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6(1), 4–22.

  7. Agrawal, R. (1995). Sample mean based index policies with O(log n) regret for the multi-armed bandit problem. Advances in Applied Probability, 27(4), 1054–1078.

  8. Auer, P., Cesa-Bianchi, N., & Fischer, P. (2002). Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2–3), 235–256.

  9. Zhao, Q., Tong, L., Swami, A., & Chen, Y. (2007). Decentralized cognitive MAC for opportunistic spectrum access in ad hoc networks: A POMDP framework. IEEE Journal on Selected Areas in Communications, 25(3), 589–600.

  10. Lai, L., El Gamal, H., Jiang, H., & Poor, H. V. (2011). Cognitive medium access: Exploration, exploitation, and competition. IEEE Transactions on Mobile Computing, 10(2), 239–253.

  11. Liu, K., & Zhao, Q. (2008). A restless bandit formulation of opportunistic access: Indexability and index policy. In 5th IEEE annual communications society conference on sensor, mesh and ad hoc communications and networks workshops, 2008, SECON Workshops’08 (pp. 1–5).

  12. Anandkumar, A., Michael, N., Tang, A. K., & Swami, A. (2011). Distributed algorithms for learning and cognitive medium access with logarithmic regret. IEEE Journal on Selected Areas in Communications, 29(4), 731–745.

  13. Torabi, N., Rostamzadeh, K., & Leung, V. C. M. (2012). Rank-optimal channel selection strategy in cognitive networks. In 2012 IEEE Global Communications Conference (GLOBECOM) (pp. 410–415).

  14. Kalathil, D., Nayyar, N., & Jain, R. (2014). Decentralized learning for multiplayer multiarmed bandits. IEEE Transactions on Information Theory, 60(4), 2331–2345.

  15. Qu, S., & Xin, Y. (2009). Distribution of SNR and error probability of a two-hop relay link in Rayleigh fading. In 2009 IEEE 70th vehicular technology conference fall (VTC 2009-Fall) (pp. 1–4).

  16. Anantharam, V., Varaiya, P., & Walrand, J. (1987). Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part I: I.I.D. rewards. IEEE Transactions on Automatic Control, 32(11), 968–976.

  17. Kim, H., & Shin, K. G. (2008). Efficient discovery of spectrum opportunities with MAC-layer sensing in cognitive radio networks. IEEE Transactions on Mobile Computing, 7(5), 533–545.

  18. Huang, J., Zhou, H., Chen, Y., Chen, B., Zhu, X., & Kong, R. (2012). Optimal channel sensing order for various applications in cognitive radio networks. Wireless Personal Communications, 71(3), 1721–1740.

  19. Song, M., & He, B. (2007). Capacity analysis for flat and clustered wireless sensor networks. In International conference on wireless algorithms, systems and applications, 2007. WASA 2007 (pp. 249–253).

  20. Gai, Y., & Krishnamachari, B. (2011). Decentralized online learning algorithms for opportunistic spectrum access. In 2011 IEEE global telecommunications conference (GLOBECOM 2011) (pp. 1–6).

Acknowledgments

The work presented in this paper was supported by the International S&T Cooperation Program of China under Grant No. 2013DFA12460 and by the Fundamental Research Funds for the Central Universities of China under Grant No. 2042015KF0053. The authors also thank Aurélien Garivier and Emilie Kaufmann for providing their MAB simulation code.

Author information

Correspondence to Yongqun Chen or Huaibei Zhou.

About this article

Cite this article

Chen, Y., Zhou, H., Kong, R. et al. QoS-Based Blind Spectrum Selection with Multi-armed Bandit Problem in Cognitive Radio Networks. Wireless Pers Commun 89, 663–685 (2016). https://doi.org/10.1007/s11277-016-3301-1
