Fast reinforcement learning algorithms for joint adaptive source coding and transmission control in IoT devices with renewable energy storage

  • Original Article
  • Neural Computing and Applications

Abstract

Energy harvesting and source coding are two key techniques for mitigating the battery limitations of devices in the Internet of Things (IoT). However, both techniques come at the expense of added complexity in the control and optimization of the digital communication chain. In this paper, we address the delay-constrained joint lossy compression and transmission control problem for an IoT device with rechargeable energy storage, with the aim of balancing four competing goals: energy-efficient communication, high-fidelity reconstruction at the IoT gateway, a low packet drop ratio, and timeliness of data. Because the energy harvesting process and the fading channel are both random, the system dynamics are stochastic; we therefore formulate a stochastic optimization problem as a constrained Markov decision process (CMDP) and use the standard Lagrangian technique to recast it in unconstrained form. To compute the optimal control policy, we propose a two-timescale stochastic approximation algorithm that combines a reinforcement learning (RL) algorithm for estimating the CMDP’s value function with stochastic gradient descent for estimating the Lagrange multiplier. Specifically, we propose three RL procedures: one based on standard Q-learning, and two accelerated procedures, namely post-decision state (PDS) learning and virtual experience (VE) learning. The accelerated procedures exploit the known system dynamics and batch updates to overcome the slow convergence caused by the asynchronous update pattern of Q-learning. Simulation results demonstrate that the proposed PDS and VE learning algorithms speed up convergence to the optimal control policy by one and two orders of magnitude, respectively.
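
The two-timescale idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the discounted-cost formulation, the step-size schedules, the sign convention of the multiplier step, and the toy buffer simulator `toy_step` are all assumptions made for illustration, and it covers only the plain Q-learning variant, not PDS or VE learning. It shows a Q-table updated on a fast timescale against the Lagrangian cost, while the Lagrange multiplier for a delay constraint is adjusted on a slower timescale.

```python
import random


def toy_step(state, action, rng):
    """Hypothetical simulator: a 4-level packet buffer (states 0..3).

    Action 1 transmits one packet at unit energy cost; action 0 idles.
    Returns (next_state, energy_cost, delay_cost), with backlog as a delay proxy.
    """
    arrivals = 1 if rng.random() < 0.5 else 0
    served = 1 if (action == 1 and state > 0) else 0
    nxt = min(3, state - served + arrivals)
    return nxt, float(action), float(state)


def two_timescale_q_learning(n_states, n_actions, step, steps=5000,
                             gamma=0.9, delay_budget=1.0, eps=0.1, seed=0):
    """Lagrangian Q-learning on two timescales (illustrative sketch only)."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    lam = 0.0   # Lagrange multiplier for the (assumed) delay constraint
    state = 0
    for n in range(1, steps + 1):
        alpha = 1.0 / n ** 0.6   # fast timescale: value-function estimate
        beta = 1.0 / n           # slow timescale: beta / alpha -> 0
        # Epsilon-greedy over costs: we minimise, so the greedy action is the argmin.
        if rng.random() < eps:
            action = rng.randrange(n_actions)
        else:
            action = min(range(n_actions), key=lambda a: q[state][a])
        nxt, energy, delay = step(state, action, rng)
        # The Lagrangian cost folds the constraint into the objective.
        cost = energy + lam * (delay - delay_budget)
        target = cost + gamma * min(q[nxt])
        q[state][action] += alpha * (target - q[state][action])
        # Stochastic gradient step on the multiplier, projected onto [0, inf).
        lam = max(0.0, lam + beta * (delay - delay_budget))
        state = nxt
    return q, lam


if __name__ == "__main__":
    q_table, multiplier = two_timescale_q_learning(4, 2, toy_step)
    print("multiplier estimate:", multiplier)
    for s, row in enumerate(q_table):
        print("state", s, "Q-values", row)
```

Only the state-action pair actually visited is updated at each step. This is the asynchronous update pattern that the abstract identifies as the cause of Q-learning's slowness.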


Fig. 1 Source coding and data transmission model

Availability of data and material

Data sharing is not applicable—no new data were generated.

Funding

No funding was received to assist with the preparation of this manuscript.

Author information

Corresponding author

Correspondence to Vesal Hakami.

Ethics declarations

Conflict of interest

All authors declare that they have no conflicts of interest that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Namjoonia, F., Sheikhi, M. & Hakami, V. Fast reinforcement learning algorithms for joint adaptive source coding and transmission control in IoT devices with renewable energy storage. Neural Comput & Applic 34, 3959–3979 (2022). https://doi.org/10.1007/s00521-021-06656-6

  • DOI: https://doi.org/10.1007/s00521-021-06656-6
