RLProph: a dynamic programming based reinforcement learning approach for optimal routing in opportunistic IoT networks

Published in Wireless Networks (2020)

Abstract

Routing in Opportunistic Internet of Things networks (OppIoTs) is a challenging task because of intermittent connectivity between devices and the lack of a fixed path between the source and destination of messages. Recently, machine learning (ML) and reinforcement learning (RL) have been used with great success to automate processes in a number of different problem domains. In this paper, we seek to fully automate the OppIoT routing process by using the Policy Iteration algorithm to maximize the probability of message delivery. Moreover, we model the OppIoT environment as a Markov decision process (MDP) complete with states, actions, rewards, and transition probabilities. The proposed routing protocol, RLProph, optimizes the routing process via the optimal policy obtained by solving the MDP using Policy Iteration. Through extensive simulations, we show that RLProph outperforms a number of ML-based and context-aware routing protocols on a range of performance criteria.
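The abstract's core idea — cast routing as an MDP and solve it with Policy Iteration — can be illustrated with a minimal sketch. The states, actions, transition probabilities, rewards, and discount factor below are illustrative placeholders, not the paper's actual OppIoT model; the sketch only shows the generic Policy Iteration loop (exact policy evaluation followed by greedy improvement) that RLProph builds on.

```python
import numpy as np

# Toy MDP, purely illustrative: 4 abstract network states and 2 actions
# (e.g. "forward the message" vs. "hold the message"). The real RLProph
# state/action spaces come from the OppIoT context and are not shown here.
n_states, n_actions = 4, 2
gamma = 0.9  # discount factor (assumed value)

rng = np.random.default_rng(0)
# P[s, a, s'] = transition probability; each (s, a) row sums to 1.
P = rng.random((n_states, n_actions, n_states))
P /= P.sum(axis=2, keepdims=True)
# R[s, a] = expected immediate reward (e.g. higher when delivery is likely).
R = rng.random((n_states, n_actions))

def policy_evaluation(policy):
    """Solve V = R_pi + gamma * P_pi V exactly as a linear system."""
    P_pi = P[np.arange(n_states), policy]   # (S, S) under the fixed policy
    R_pi = R[np.arange(n_states), policy]   # (S,)
    return np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)

def policy_iteration():
    policy = np.zeros(n_states, dtype=int)
    while True:
        V = policy_evaluation(policy)
        # Greedy improvement: Q[s, a] = R[s, a] + gamma * sum_s' P[s,a,s'] V[s']
        Q = R + gamma * (P @ V)
        new_policy = Q.argmax(axis=1)
        if np.array_equal(new_policy, policy):
            return policy, V  # converged: policy is optimal for this MDP
        policy = new_policy

pi_star, V_star = policy_iteration()
print("optimal policy:", pi_star)
print("state values:", V_star)
```

At convergence the value function satisfies the Bellman optimality condition, V(s) = max_a Q(s, a), which is what guarantees the returned policy maximizes the expected discounted reward for the modeled MDP.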

Acknowledgements

This work is partially supported by FCT/MCTES through national funds and when applicable co-funded EU funds under the Project UIDB/EEA/50008/2020; and by Brazilian National Council for Research and Development (CNPq) via Grant No. 309335/2017-5.

Author information

Correspondence to Deepak Kumar Sharma.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Sharma, D.K., Rodrigues, J.J.P.C., Vashishth, V. et al. RLProph: a dynamic programming based reinforcement learning approach for optimal routing in opportunistic IoT networks. Wireless Netw 26, 4319–4338 (2020). https://doi.org/10.1007/s11276-020-02331-1
