Abstract
Energy harvesting and source coding are two key techniques for mitigating the battery limitation of devices in the Internet of Things (IoT). However, they come at the expense of added complexity in the control and optimization of the digital communication chain. In this paper, to strike a balance between the competing goals of energy-efficient communication, high-fidelity reconstruction at the IoT gateway, low packet drop ratio, and timeliness of data, we address the delay-constrained joint lossy compression and transmission control problem for an IoT device with rechargeable energy storage. Given the stochastic dynamics arising from the random nature of the energy harvesting process and the fading channel, we formulate a stochastic optimization problem as a constrained Markov decision process (CMDP) and use the standard Lagrangian technique to recast it in unconstrained form. To compute the optimal control policy, we propose a two-timescale stochastic approximation algorithm that combines a reinforcement learning (RL) procedure for estimating the CMDP’s value function with stochastic gradient descent for estimating the Lagrange multiplier. Specifically, we propose three RL procedures: one based on standard Q-learning, and two accelerated learning procedures, namely post-decision state (PDS) learning and virtual experience (VE) learning. The two accelerated procedures exploit the known system dynamics and batch updates to overcome the slow convergence caused by the asynchronous updating pattern of Q-learning. Simulation results demonstrate that the proposed PDS and VE learning algorithms speed up convergence to the optimal control policy by one and two orders of magnitude, respectively.
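As a rough illustration of the two-timescale idea described in the abstract, the following minimal sketch pairs a tabular Q-learning step on the Lagrangian cost (fast timescale) with a projected stochastic gradient step on the Lagrange multiplier (slow timescale). It is not the authors' implementation: the toy backlog model, the discounted-cost formulation, the step-size schedules, and all names (`q_update`, `multiplier_update`, `toy_step`, `run`) are assumptions made for illustration, and only the plain Q-learning variant is shown, not the PDS or VE procedures.

```python
import random

def q_update(Q, s, a, s_next, cost, penalty, lam, alpha, gamma=0.95):
    """Fast timescale: one Q-learning step on the Lagrangian cost cost + lam * penalty."""
    target = cost + lam * penalty + gamma * min(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])

def multiplier_update(lam, penalty, budget, beta):
    """Slow timescale: stochastic gradient step on the multiplier, projected onto lam >= 0."""
    return max(0.0, lam + beta * (penalty - budget))

def toy_step(s, a, rng, n_states):
    """Hypothetical dynamics: s is a packet backlog, a is the number of packets sent."""
    arrival = 1 if rng.random() < 0.5 else 0   # assumed Bernoulli(0.5) arrivals
    served = min(a, s)                         # cannot send more than is queued
    s_next = min(n_states - 1, s - served + arrival)
    cost = float(a)       # assumed energy cost: one unit per attempted transmission
    penalty = float(s)    # backlog used as a proxy for delay
    return s_next, cost, penalty

def run(steps=20000, n_states=4, n_actions=2, budget=0.5, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    lam, s = 0.0, 0
    for n in range(1, steps + 1):
        alpha = 1.0 / n ** 0.6   # fast step size for the Q-table
        beta = 1.0 / n           # slow step size, so that beta / alpha -> 0
        # epsilon-greedy action selection on the current cost-to-go estimates
        if rng.random() < eps:
            a = rng.randrange(n_actions)
        else:
            a = min(range(n_actions), key=lambda u: Q[s][u])
        s_next, cost, penalty = toy_step(s, a, rng, n_states)
        q_update(Q, s, a, s_next, cost, penalty, lam, alpha)
        lam = multiplier_update(lam, penalty, budget, beta)
        s = s_next
    return Q, lam

if __name__ == "__main__":
    Q, lam = run()
    print("multiplier estimate:", lam)
    print("greedy action per state:", [min(range(2), key=lambda u: row[u]) for row in Q])
```

The multiplier uses a step size that decays faster than the one for the Q-table, so from the multiplier's point of view the Q-values appear nearly converged at each step. The multiplier rises while the observed backlog exceeds the assumed budget and is clipped at zero otherwise.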
Availability of data and material
Data sharing is not applicable—no new data were generated.
Funding
No funding was received to assist with the preparation of this manuscript.
Ethics declarations
Conflict of interest
All authors declare that they have no conflicts of interest that are relevant to the content of this article.
Cite this article
Namjoonia, F., Sheikhi, M. & Hakami, V. Fast reinforcement learning algorithms for joint adaptive source coding and transmission control in IoT devices with renewable energy storage. Neural Comput & Applic 34, 3959–3979 (2022). https://doi.org/10.1007/s00521-021-06656-6
DOI: https://doi.org/10.1007/s00521-021-06656-6