
Reinforcement learning models for scheduling in wireless networks

Review Article. Published in Frontiers of Computer Science.

Abstract

The dynamicity of available resources and network conditions, such as channel capacity and traffic characteristics, has posed major challenges to scheduling in wireless networks. Reinforcement learning (RL) enables wireless nodes to observe their respective operating environments, learn, and make optimal or near-optimal scheduling decisions. Learning, the main intrinsic characteristic of RL, enables wireless nodes to adapt over time to most forms of dynamicity in the operating environment. This paper presents an extensive review of the application of traditional and enhanced RL approaches to various types of scheduling schemes in wireless networks, namely packet, sleep-wake, and task schedulers, as well as the advantages and performance enhancements brought about by RL. Additionally, it presents how various challenges associated with scheduling schemes have been addressed using RL. Finally, we discuss open issues related to RL-based scheduling schemes in wireless networks in order to explore new research directions in this area. Discussions in this paper are presented in a tutorial manner in order to establish a foundation for further research in this field.
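The RL loop sketched in the abstract (observe the environment, learn from feedback, choose a scheduling action) can be illustrated with a minimal tabular Q-learning scheduler. The states, actions, and reward shaping below are illustrative assumptions for a toy packet scheduler, not the exact formulation of any scheme surveyed in the paper.

```python
import random

class QScheduler:
    """Minimal Q-learning agent: a toy sketch of an RL-based packet scheduler."""

    def __init__(self, states, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        # Q-table maps (state, action) pairs to estimated long-term reward.
        self.q = {(s, a): 0.0 for s in states for a in actions}
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def choose(self, state):
        # Epsilon-greedy: explore occasionally, otherwise exploit the best-known action.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update rule.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        self.q[(state, action)] += self.alpha * (
            reward + self.gamma * best_next - self.q[(state, action)]
        )

# Toy usage: queue-length states and two candidate flows to serve (hypothetical names).
states = ["low_queue", "high_queue"]
actions = ["serve_flow_1", "serve_flow_2"]
sched = QScheduler(states, actions)
state = "high_queue"
action = sched.choose(state)
# Reward would come from observed throughput/delay; here a dummy value.
sched.update(state, action, reward=1.0, next_state="low_queue")
```

Because the Q-table is updated online from observed rewards, the scheduler adapts as channel capacity or traffic characteristics change, which is the adaptivity the survey attributes to RL-based schemes.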



Author information


Corresponding author

Correspondence to Kok-Lim Alvin Yau.

Additional information

Kok-Lim Alvin Yau has a BE from Universiti Teknologi Petronas, Malaysia, an MSc from National University of Singapore, and a PhD from Victoria University, New Zealand. His research interests are wireless networks and applied artificial intelligence.

Kae Hsiang Kwong has a BE from Jinan University, China, and a PhD from University of Strathclyde, UK. His research interests include network infrastructure design, monitoring, and performance optimization.

Chong Shen has a BE from Wuhan University, China, an MPhil from University of Strathclyde, UK, and a PhD from Cork Institute of Technology, Ireland. His research interests cover layer-2 and layer-3 algorithms and protocols.


Cite this article

Yau, KL.A., Kwong, K.H. & Shen, C. Reinforcement learning models for scheduling in wireless networks. Front. Comput. Sci. 7, 754–766 (2013). https://doi.org/10.1007/s11704-013-2291-3
