Abstract
The increasing demand for data collection from various Internet of Things (IoT) devices, combined with the limited energy of the nodes in these networks, leads to complex network conditions. Energy consumption across the large number of interconnected nodes in IoT is therefore an active research subject. In many applications, IoT devices are deployed in environments that are difficult to access physically, so an energy-efficient mechanism is essential. This article proposes a new method for improving energy efficiency based on machine learning from the perspective of cognitive networks, in which network parameters are estimated using learning automata. The transmission power of the network nodes is then adjusted in a self-organized, self-aware, and dynamic manner, adapting the behavior of the nodes to the current conditions of the network. One strength of this method compared to existing approaches is that network conditions are estimated through the parameters of delay (D), channel status (S), and data rate, and all decisions are made on the basis of these conditions. Experimental results show that addressing energy efficiency from the cognitive-network perspective not only improves Quality of Service (QoS) parameters such as operational power and end-to-end delay, but also increases network lifetime compared to other energy-efficient methods.
Availability of data and material
Not applicable.
References
Naderi M, Mahdaee K, Rahmani P (2023) Hierarchical traffic light-aware routing via fuzzy reinforcement learning in software-defined vehicular networks. Peer-to-Peer Netw Appl 16(2):1174–1198. https://doi.org/10.1007/s12083-022-01424-2
Naderi M, Zargari F, Sadatpour V, Ghanbari M (2017) A 3-parameter routing cost function for improving opportunistic routing performance in VANETs. Wirel Pers Commun 97(1):1–15. https://doi.org/10.1007/s11277-017-4496-5
Naderi M, Zargari F, Ghanbari M (2019) Adaptive beacon broadcast in opportunistic routing for VANETs. Ad Hoc Netw 86:119–130. https://doi.org/10.1016/j.adhoc.2018.11.011
Naderi M, Ghanbari M (2023) Adaptively prioritizing candidate forwarding set in opportunistic routing in VANETs. Ad Hoc Netw 140:103048. https://doi.org/10.1016/j.adhoc.2022.103048
Naderi M, Chakareski J, Ghanbari M (2023) Hierarchical Q-learning-enabled neutrosophic AHP scheme in candidate relay set size adaption in vehicular networks. Comput Netw 235:109968. https://doi.org/10.1016/j.comnet.2023.109968
Jiang S, Huang Z, Ji Y (2020) Adaptive UAV-assisted geographic routing with Q-learning in VANET. IEEE Commun Lett 25(4):1358–1362. https://doi.org/10.1109/LCOMM.2020.3048250
Da Costa LA, Kunst R, Pignaton de Freitas E (2021) Q-FANET: improved Q-learning based routing protocol for FANETs. Comput Netw 198:108379. https://doi.org/10.1016/j.comnet.2021.108379
Quy VK, Nam VH, Linh DM, Ngoc LA (2022) Routing algorithms for MANET-I networks: comprehensive survey. Wirel Pers Commun 125(4):3501–3535. https://doi.org/10.1007/s11277-022-09722-x
Quy VK, Nam VH, Linh DM, Ban NT, Han ND (2021) A survey of QoS-aware routing protocols for the MANET-WSN convergence scenarios in IoT networks. Wirel Pers Commun 120(1):49–62. https://doi.org/10.1007/s11277-021-08433-z
Mahamat M, Jaber G, Bouabdallah A (2023) Achieving efficient energy-aware security in IoT networks: a survey of recent solutions and research challenges. Wirel Netw 29(2):787–808. https://doi.org/10.1007/s11276-022-03170-y
Aboubakar M, Kellil M, Roux P (2022) A review of IoT network management: current status and perspectives. J King Saud Univ Comput Inf Sci 34(7):4163–4176. https://doi.org/10.1016/j.jksuci.2021.03.006
SaiRamesh L, Sabena S, Selvakumar K (2022) Energy efficient service selection from IoT based on QoS using HMM with KNN and XGBOOST. Wirel Pers Commun 124(4):3591–3602. https://doi.org/10.1007/s11277-022-09527-y
Su Z, Feng W, Tang J, Chen Z, Fu Y, Zhao N, Wong KK (2022) Energy efficiency optimization for D2D communications underlaying UAV-assisted industrial IoT networks with SWIPT. IEEE Internet Things J 10(3):1990–2002. https://doi.org/10.1109/JIOT.2022.3142026
Praveen KV, Prathap PJ (2021) Energy efficient congestion aware resource allocation and routing protocol for IoT network using hybrid optimization techniques. Wirel Pers Commun 117(2):1187–1207. https://doi.org/10.1007/s11277-020-07917-8
Wu Q, Ding G, Xu Y, Feng S, Du Z, Wang J, Long K (2014) Cognitive internet of things: a new paradigm beyond connection. IEEE Internet Things J 1(2):129–143. https://doi.org/10.1109/JIOT.2014.2311513
Arora S, Batra I, Malik A, Luhach AK, Alnumay WS, Chatterjee P (2023) Seed: secure and energy efficient data-collection method for IoT network. Multimed Tools Appl 82(2):3139–3153. https://doi.org/10.1007/s11042-022-13614-4
Al-Ma’aitah M, Alwadain A, Saad A (2021) Transmission adaptive mode selection (TAMS) method for internet of things device energy management. Peer-to-Peer Netw Appl 14:2316–2326. https://doi.org/10.1007/s12083-020-00937-y
Urosevic U (2022) New solutions for increasing energy efficiency in massive IoT. SIViP 16(7):1861–1868. https://doi.org/10.1007/s11760-022-02145-y
Liu X, Jia M, Zhou M, Wang B, Durrani TS (2021) Integrated cooperative spectrum sensing and access control for cognitive industrial Internet of Things. IEEE Internet Things J 10(3):1887–1896. https://doi.org/10.1109/JIOT.2021.3137408
Bakshi M, Chowdhury C, Maulik U (2021) Energy-efficient cluster head selection algorithm for IoT using modified glow-worm swarm optimization. J Supercomput 77:6457–6475. https://doi.org/10.1007/s11227-020-03536-z
Thenmozhi R, Sakthivel P, Kulothungan K (2022) Hybrid multi-objective-optimization algorithm for energy efficient priority-based QoS routing in IoT networks. Wirel Netw 1–18. https://doi.org/10.1007/s11276-021-02848-z
Yang H, Zhong WD, Chen C, Alphones A, Xie X (2020) Deep-reinforcement-learning-based energy-efficient resource management for social and cognitive internet of things. IEEE Internet Things J 7(6):5677–5689. https://doi.org/10.1109/JIOT.2020.2980586
Yang L, Li M, Si P, Yang R, Sun E, Zhang Y (2020) Energy-efficient resource allocation for blockchain-enabled industrial Internet of Things with deep reinforcement learning. IEEE Internet Things J 8(4):2318–2329. https://doi.org/10.1109/JIOT.2020.3030646
Li Y, Sun Z, Han L, Mei N (2017) Fuzzy comprehensive evaluation method for energy management systems based on an internet of things. IEEE Access 5:21312–21322. https://doi.org/10.1109/ACCESS.2017.2728081
Lei F, Zhao S, Sun M, Zhou Z (2019) Energy-efficient boundary detection of continuous objects in internet of things sensing networks. IEEE Access 8:92007–92018. https://doi.org/10.1109/ACCESS.2019.2955708
Ashiquzzaman A, Lee H, Um TW, Kim J (2020) Energy-efficient IoT sensor calibration with deep reinforcement learning. IEEE Access 8:97045–97055. https://doi.org/10.1109/ACCESS.2020.2992853
Liu Y, Liu A, Hu Y, Li Z, Choi YJ, Sekiya H, Li J (2016) FFSC: an energy efficiency communications approach for delay minimizing in internet of things. IEEE Access 4:3775–3793. https://doi.org/10.1109/ACCESS.2016.2588278
Yang Z, Xu W, Pan Y, Pan C, Chen M (2017) Energy efficient resource allocation in machine-to-machine communications with multiple access and energy harvesting for IoT. IEEE Internet Things J 5(1):229–245. https://doi.org/10.1109/JIOT.2017.2778766
Multanen J, Kulttala H, Tervo K, Jääskeläinen P (2020) Energy efficient low latency multi-issue cores for intelligent always-on IoT applications. J Signal Process Syst 92:1057–1073. https://doi.org/10.1007/s11265-020-01578-3
Tomazzoli C, Scannapieco S, Cristani M (2020) Internet of things and artificial intelligence enable energy efficiency. J Ambient Intell Humaniz Comput 14(5):4933–4954. https://doi.org/10.1007/s12652-020-02151-3
Yang H, Alphones A, Zhong WD, Chen C, Xie X (2019) Learning-based energy-efficient resource management by heterogeneous RF/VLC for ultra-reliable low-latency industrial IoT networks. IEEE Trans Industr Inf 16(8):5565–5576. https://doi.org/10.1109/TII.2019.2933867
Raj Kumar NP, Bala GJ (2022) A cognitive knowledged energy-efficient path selection using centroid and ant-colony optimized hybrid protocol for WSN-assisted IoT. Wirel Pers Commun 124(3):1993–2028. https://doi.org/10.1007/s11277-021-09440-w
Omidkar A, Khalili A, Nguyen HH, Shafiei H (2022) Reinforcement-learning-based resource allocation for energy-harvesting-aided D2D communications in IoT networks. IEEE Internet Things J 9(17):16521–16531. https://doi.org/10.1109/JIOT.2022.3151001
Javadpour A, Wang G, Rezaei S (2020) Resource management in a peer-to-peer cloud network for IOT. Wirel Pers Commun 115:2471–2488. https://doi.org/10.1007/s11277-020-07691-7
Javadpour A, Nafei A, Ja’fari F, Pinto P, Zhang W, Sangaiah AK (2022) An intelligent energy-efficient approach for managing IOE tasks in cloud platforms. J Ambient Intell Humaniz Comput. https://doi.org/10.1007/s12652-022-04464-x
Javadpour A, Sangaiah AK, Pinto P, Ja’fari F, Zhang W, Abadi AM, Ahmadi H (2022) An energy-optimized embedded load balancing using DVFS computing in cloud data centers. Comput Commun. https://doi.org/10.1016/j.comcom.2022.10.019
Dattatraya KN, Rao KR (2022) Hybrid based cluster head selection for maximizing network lifetime and energy efficiency in WSN. J King Saud Univ Comput Inf Sci. https://doi.org/10.1016/j.jksuci.2019.04.003
Sutton RS, Barto AG (2018) Reinforcement learning: an introduction. MIT Press
Thathachar MA, Sastry PS (2003) Networks of learning automata: techniques for online stochastic optimization. Springer Science & Business Media
Rahmani P, Javadi HHS, Bakhshi H, Hosseinzadeh M (2018) TCLAB: a new topology control protocol in cognitive MANETs based on learning automata. J Netw Syst Manage 26:426–462. https://doi.org/10.1007/s10922-017-9422-3
Rahmani P, Javadi HHS, Bakhshi H, Hosseinzadeh M (2018) Cog-MAC protocol: channel allocation in cognitive ad hoc networks based on the game of learning automata. Wirel Pers Commun 103:2285–2316. https://doi.org/10.1007/s11277-018-5911-2
Rahmani P, Javadi HHS (2019) Topology control in MANETs using the Bayesian pursuit algorithm. Wirel Pers Commun 106:1089–1116. https://doi.org/10.1007/s11277-019-062054
Rezvanian A, Moradabadi B, Ghavipour M, Khomami MMD, Meybodi MR (2019) Learning automata approach for social networks, vol 820. Springer, Berlin
Author information
Contributions
All authors contributed equally to this manuscript.
Ethics declarations
Ethics approval
The paper is original and is not under consideration for publication elsewhere. It reflects the authors' own research and analysis in a truthful and complete manner. All sources used are properly disclosed with correct citations.
Consent for publication
Not applicable.
Consent to participate
Not applicable.
Competing interests
There is no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix
Proofs of convergence for the CALA algorithm: We can write the CALA algorithm in the form
-
1.
\(Z\left(k+1\right)=Z\left(k\right)+\lambda G\left(Z\left(k\right),\xi \left(k\right)\right)\)
where \(Z\left(k\right)={\left[\mu \left(k\right)\ \sigma \left(k\right)\right]}^{T}\) is the state at iteration k, \(\xi \left(k\right)={\left[x\left(k\right)\ {\beta }_{x\left(k\right)}\ {\beta }_{\mu \left(k\right)}\right]}^{T}\) is a random vector whose distribution depends on Z(k), and G symbolically represents the updating of the two components of Z(k), i.e., the update rules for \(\mu \left(k\right)\) and \(\sigma \left(k\right)\).
The algorithm updates \(\mu \left(k\right)\) and \(\sigma \left(k\right)\), the parameters of the action probability distribution, at iteration k.
-
2.
\(\mu \left(k+1\right)= \mu \left(k\right)+ \lambda {F}_{1}\left(\mu \left(k\right),\sigma \left(k\right),x\left(k\right),{\beta }_{x\left(k\right)},{\beta }_{\mu \left(k\right)}\right)\), \(\sigma \left(k+1\right)= \sigma \left(k\right)+ \lambda {F}_{2}\left(\mu \left(k\right),\sigma \left(k\right),x\left(k\right),{\beta }_{x\left(k\right)},{\beta }_{\mu \left(k\right)}\right)-\lambda K\left[\sigma \left(k\right)-{\sigma }_{l}\right]\)
where \({F}_{1}\left(\mu ,\sigma ,x,{\beta }_{x},{\beta }_{\mu }\right)=\left(\frac{{\beta }_{x}-{\beta }_{\mu }}{\phi \left(\sigma \right)}\right)\left(\frac{x-\mu }{\phi \left(\sigma \right)}\right)\) and \({F}_{2}\left(\mu ,\sigma ,x,{\beta }_{x},{\beta }_{\mu }\right)=\left(\frac{{\beta }_{x}-{\beta }_{\mu }}{\phi \left(\sigma \right)}\right)\left[{\left(\frac{x-\mu }{\phi \left(\sigma \right)}\right)}^{2}-1\right]\).
Here x(k) is the action selected at iteration k, \({\beta }_{x\left(k\right)}\) and \({\beta }_{\mu \left(k\right)}\) are the reinforcements for the two actions x(k) and \(\mu \left(k\right)\) respectively, and \({F}_{1}\) and \({F}_{2}\) are functions of five variables.
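As a concrete illustration, the update above can be sketched in Python. This is a minimal sketch, not the paper's implementation: the reward function `f`, the noiseless reinforcements \({\beta }_{x}=f(x)\) and \({\beta }_{\mu }=f(\mu)\), and the constants `lam`, `K`, and `sigma_l` are all illustrative assumptions.

```python
import numpy as np

def phi(sigma, sigma_l=0.01):
    # phi(sigma) = sigma_l for sigma <= sigma_l, and sigma for sigma > sigma_l
    return max(sigma, sigma_l)

def cala_F(mu, sigma, x, beta_x, beta_mu, sigma_l=0.01):
    """The five-variable functions F1 and F2 of the CALA update."""
    s = phi(sigma, sigma_l)
    F1 = ((beta_x - beta_mu) / s) * ((x - mu) / s)
    F2 = ((beta_x - beta_mu) / s) * (((x - mu) / s) ** 2 - 1.0)
    return F1, F2

def cala_step(mu, sigma, f, rng, lam=0.05, K=1.0, sigma_l=0.01):
    """One CALA iteration: sample an action, observe reinforcements, update (mu, sigma)."""
    x = rng.normal(mu, phi(sigma, sigma_l))              # action x(k) ~ N(mu, phi(sigma))
    F1, F2 = cala_F(mu, sigma, x, f(x), f(mu), sigma_l)  # beta_x, beta_mu as noiseless stand-ins
    return mu + lam * F1, sigma + lam * F2 - lam * K * (sigma - sigma_l)
```

Iterating `cala_step` with a bounded, smooth reward function drives \(\mu \left(k\right)\) toward a maximizer of f while \(\sigma \left(k\right)\) shrinks toward \({\sigma }_{l}\).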
Let f(x) denote the reward probability function.
-
3.
\(f\left(x\right)=E\left[{\beta }_{x\left(k\right)}\mid x\left(k\right)=x\right]\)
where \({\beta }_{x(k)}\) is the reinforcement and \(x\left(k\right)\) is the action selected at iteration k.
-
4.
\(J\left(\mu ,\sigma \right)=\int f\left(x\right)\, dN\left(\mu ,\sigma \right)=\int f\left(x\right)\, N\left(\mu ,\sigma \right)\left(x\right)\, dx\)
where N(\(\mu,\ \sigma\)) = \(\frac{1}{\sigma \sqrt{2\pi }} {e}^{-\frac{{\left(x-\mu \right)}^{2}}{2{\sigma }^{2}}}\) is the normal density. (In general, dN(a, b) denotes integration with respect to the normal distribution with mean a and variance \({b}^{2}\).) With a simple coordinate transformation, we can rewrite formula 4 as:
-
5.
\(J\left(\mu ,\sigma \right)=\int f\left(\sigma x+\mu \right)\, dN\left(0,1\right)\), where the integration is now with respect to the standard normal distribution (after the substitution \(x\to \sigma x+\mu\))
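The equality in formula 5 is easy to check numerically by Monte Carlo sampling. Here `np.tanh` is just an arbitrary bounded, smooth stand-in for the unknown reward function f, and the values of \(\mu\) and \(\sigma\) are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)
f = np.tanh            # arbitrary bounded, continuously differentiable stand-in reward
mu, sigma = 0.5, 2.0
n = 500_000

x = rng.normal(mu, sigma, n)    # J(mu, sigma) = E[f(X)],        X ~ N(mu, sigma)
z = rng.normal(0.0, 1.0, n)     # J(mu, sigma) = E[f(sigma*Z+mu)], Z ~ N(0, 1)
J_direct = f(x).mean()
J_transformed = f(sigma * z + mu).mean()
assert abs(J_direct - J_transformed) < 0.02   # equal up to Monte Carlo error
```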
Here g has two components, \({g}_{1}\) and \({g}_{2}\), which we can equivalently regard as functions of the two state variables \(\mu\) and \(\sigma\). We can calculate \({g}_{1}\) and \({g}_{2}\) as follows:
-
6.
\({g}_{1}\left(\mu ,\sigma \right)=E\left[{F}_{1}\left(\mu \left(k\right),\sigma \left(k\right),x\left(k\right),{\beta }_{x\left(k\right)},{\beta }_{\mu \left(k\right)}\right)\mid \mu \left(k\right)=\mu ,\ \sigma \left(k\right)=\sigma \right]\) = \(\int \left\{E\left[\frac{{\beta }_{x}-{\beta }_{\mu }}{\phi \left(\sigma \right)}\mid x\left(k\right)=x,\ \mu \left(k\right)=\mu ,\ \sigma \left(k\right)=\sigma \right]\right\}\left(\frac{x-\mu }{\phi \left(\sigma \right)}\right)\, dN\left(\mu ,\phi \left(\sigma \right)\right)\) = \(\int \left(\frac{f\left(x\right)-f\left(\mu \right)}{\phi \left(\sigma \right)}\right)\left(\frac{x-\mu }{\phi \left(\sigma \right)}\right)\, dN\left(\mu ,\phi \left(\sigma \right)\right)\) = \(\int \frac{f\left(x\right)}{\phi \left(\sigma \right)}\left(\frac{x-\mu }{\phi \left(\sigma \right)}\right)\, dN\left(\mu ,\phi \left(\sigma \right)\right)\)
The last step above follows because \(f\left(\mu \right)\) is independent of x and hence comes out of the integral, and the remaining integral \(\int \left(x-\mu \right)\, dN\left(\mu ,\phi \left(\sigma \right)\right)\) is zero. Recognizing the result as the partial derivative of J with respect to \(\mu\), we now get
-
7.
\({g}_{1}\left(\mu ,\sigma \right)=\frac{\partial J}{\partial \mu }\left(\mu ,\phi \left(\sigma \right)\right)\)
where \(\frac{\partial J}{\partial \mu}\) is the partial derivative of J with respect to \(\mu\), and \(\phi \left(\sigma \right)={\sigma }_{l}\) for \(\sigma \le {\sigma }_{l}\), \(\phi \left(\sigma \right)=\sigma\) for \(\sigma >{\sigma }_{l}>0\).
In a similar fashion, we can show that
-
8.
\({g}_{2}\left(\mu ,\sigma \right)= E\left[{F}_{2}\left(\mu \left(k\right),\sigma \left(k\right),x\left(k\right),{\beta }_{x\left(k\right)},{\beta }_{\mu \left(k\right)}\right)-K\left(\sigma \left(k\right)-{\sigma }_{l}\right)\mid \mu \left(k\right)=\mu ,\ \sigma \left(k\right)=\sigma \right]\) = \(\frac{\partial J}{\partial \sigma }\left(\mu ,\phi \left(\sigma \right)\right)-K\left[\sigma -{\sigma }_{l}\right]\)
where \(\frac{\partial J}{\partial \sigma}\) is the partial derivative of J with respect to \(\sigma\), and K is a constant. Having calculated the g function, the approximating ODE is \(\dot{Z}=g\left(Z\right)\). Since the state here has two components, namely \(\mu\) and \(\sigma\), using formulas 7 and 8 the approximating ODE for the CALA algorithm is:
-
9.
\(\frac{d\mu }{dt} = \frac{\partial J}{\partial \mu }\) (\(\mu ,\phi \left(\sigma \right)\))
-
10.
\(\frac{d\sigma }{dt} = \frac{\partial J}{\partial \sigma }\) (\(\mu ,\phi \left(\sigma \right)\)) – K [\(\sigma -{\sigma }_{l}\)]
This completes the calculation of the approximating ODE for the CALA algorithm. To prove that this is indeed the approximating ODE, all we need to do is verify assumptions A1 to A4.
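As a numerical sanity check of formulas 9 and 10, the ODE can be Euler-integrated for a test reward \(f(x)={e}^{-{(x-1)}^{2}}\), for which \(J(\mu ,\sigma )\) has a closed form. The trajectory drives \(\mu\) to the maximizer x = 1 and \(\sigma\) down to roughly \({\sigma }_{l}\). The constants, the test reward, and the finite-difference gradients are illustrative choices, not taken from the paper:

```python
import numpy as np

def J(mu, sigma):
    # Closed form of the integral of exp(-(x-1)^2) dN(mu, sigma) for the test reward
    return np.exp(-(mu - 1.0) ** 2 / (1.0 + 2.0 * sigma ** 2)) / np.sqrt(1.0 + 2.0 * sigma ** 2)

def ode_rhs(mu, sigma, K=1.0, sigma_l=0.01, h=1e-5):
    # Right-hand side of formulas 9 and 10, gradients taken by central differences
    s = max(sigma, sigma_l)                              # phi(sigma)
    dJ_dmu = (J(mu + h, s) - J(mu - h, s)) / (2 * h)
    dJ_dsig = (J(mu, s + h) - J(mu, s - h)) / (2 * h)
    return dJ_dmu, dJ_dsig - K * (sigma - sigma_l)

mu, sigma, dt = -1.0, 1.0, 0.05
for _ in range(40_000):                                  # Euler integration to t = 2000
    dmu, dsig = ode_rhs(mu, sigma)
    mu, sigma = mu + dt * dmu, sigma + dt * dsig
# mu ends near the maximizer 1; sigma ends near (or slightly below) sigma_l
```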
Assumptions
We make the following assumptions on the system.
-
A1.
\(\left\{({X}_{k}^{b} , {\xi }_{k}^{b} , k\ge 1)\right\}\) is a Markov process.
-
A2.
For any appropriate Borel set B,
\(\mathrm{Prob}\left[{\xi }_{k}^{b} \in B \mid {X}_{k}^{b} , {\xi }_{k-1}^{ b} \right]=\mathrm{Prob}\left[{\xi }_{k}^{b} \in B \mid {X}_{k}^{b}\right].\)
That is, conditioned on \({X}_{k}^{b}\), \({\xi }_{k}^{b}\) is independent of \({\xi }_{k-1}^{ b}\).
-
A3.
Define g: \({R}^{N} \to {R}^{N}\) by g(x) = E [G (\({X}_{k},{\xi }_{k}\))\(\mid {X}_{k}=x\)].
We assume that g(·) is independent of k and is globally Lipschitz.
-
A4.
Define \({\theta }_{k}^{b}\) = G (\({X}_{k}^{b}\),\({\xi }_{k}^{b}\)) – g (\({X}_{k}^{b}\))
We assume that E \({\Vert {\theta }_{k}^{b}\Vert }^{2}<M < \infty\), for some M and \(\forall k.\)
From the structure of the algorithm it is immediately obvious that assumptions A1 and A2 are satisfied. To satisfy A3 and A4, we make the following additional assumptions on the unknown reward function f(·).
-
B1.
f(·) is bounded and continuously differentiable. Let L denote the bound of f(·).
-
B2.
The derivative of f, namely \({f}^{\prime}\left(\cdot\right)\), is globally Lipschitz, that is, \(\left|{f}^{\prime}\left(x\right)- {f}^{\prime}\left(y\right)\right| \le {K}_{0}\left|x-y\right|\)
-
B3.
Define zero-mean random variables \({\tau }_{x}= {\beta }_{x}-f\left(x\right)\). We assume that \(\mathrm{Var}\left({\tau }_{x}\right)\le {\sigma }_{M}^{2}\) for some \({\sigma }_{M}<\infty\) and that \(E\left[{\tau }_{x}{\tau }_{y}\mid x,y\right]=0\) for \(x\ne y\).
Under assumptions B1–B3, we now verify A3 and A4. Differentiating formula 5 under the integral sign, we get
-
11.
\(\frac{\partial J}{\partial \mu }\left(\mu , \sigma \right) = \int {f}^{\prime}\left(\sigma x+\mu \right)\, d N\left(\mathrm{0,1}\right)\)
-
12.
\(\frac{\partial J}{\partial \sigma }\left(\mu , \sigma \right)= \int {f}^{\prime}\left(\sigma x+\mu \right)x\ d\ N\ \left(\mathrm{0,1}\right)\)
Now we have:
-
13.
\(\left|{g}_{1}\left({\mu }_{1},{\sigma }_{1}\right)- {g}_{1}\left({\mu }_{2},{\sigma }_{2}\right)\right| = \left|\frac{\partial J}{\partial \mu }\left({\mu }_{1},{\sigma }_{1}\right)-\frac{\partial J}{\partial \mu }\left({\mu }_{2},{\sigma }_{2}\right)\right| \le \int \left|{f}^{\prime}\left({\sigma }_{1}x+{\mu }_{1}\right)- {f}^{\prime}\left({\sigma }_{2}x + {\mu }_{2} \right)\right|\, dN\left(\mathrm{0,1}\right) \le {K}_{0} \int \left|{\sigma }_{1}x + {\mu }_{1}- {\sigma }_{2}x- {\mu }_{2} \right|\, dN\left(\mathrm{0,1}\right) \le {K}_{0}\left|{\mu }_{1}- {\mu }_{2}\right| \int dN\left(\mathrm{0,1}\right)+ {K}_{0}\left|{\sigma }_{1}- {\sigma }_{2}\right| \int \left|x\right|\, dN\left(\mathrm{0,1}\right) \le {K}_{1} \Vert \left({\mu }_{1},{\sigma }_{1}\right)-\left({\mu }_{2},{\sigma }_{2}\right)\Vert\)
where \({K}_{1}\) depends on \({K}_{0}\) and the first absolute moment of the standard normal distribution. Similarly, one can show that
-
14.
\(\left|{g}_{2}\left({\mu }_{1},{\sigma }_{1}\right)- {g}_{2}\left({\mu }_{2},{\sigma }_{2}\right)\right| \le {{K}^{\mathrm{^{\prime}}}}_{2} \Vert \left({\mu }_{1},{\sigma }_{1}\right)- \left({\mu }_{2},{\sigma }_{2}\right)\Vert\) + K \(\left|{\sigma }_{1}- {\sigma }_{2}\right| \le {K}_{2 }\Vert \left({\mu }_{1},{\sigma }_{1}\right)-\left({\mu }_{2},{\sigma }_{2}\right)\Vert\)
This proves that g is globally Lipschitz and thus verifies A3. We note here that if, instead of assumption B2, we had assumed that \({f}^{\prime}\) is Lipschitz on compact sets, then the above proves that g is Lipschitz on compact sets.
-
15.
\(E\left[{G}_{1}{\left(X,\xi \right)}^{2}\mid X\right] = E\left[{\left(\frac{{\beta }_{X}-{\beta }_{\mu }}{\phi \left(\sigma \right)}\cdot \frac{X-\mu }{\phi \left(\sigma \right)}\right)}^{2}\mid \mu ,\sigma \right] = E\left[E\left[{\left(\frac{{\beta }_{X}-{\beta }_{\mu }}{\phi \left(\sigma \right)}\right)}^{2} {\left(\frac{X-\mu }{\phi \left(\sigma \right)}\right)}^{2}\mid x,\mu ,\sigma \right]\mid \mu ,\sigma \right] = E\left[{\left(\frac{X-\mu }{\phi \left(\sigma \right)}\right)}^{2} E\left[{\left(\frac{{\beta }_{X}-{\beta }_{\mu }}{\phi \left(\sigma \right)}\right)}^{2}\mid x,\mu ,\sigma \right]\mid \mu ,\sigma \right] \le E\left[{\left(\frac{X-\mu }{\phi \left(\sigma \right)}\right)}^{2} \frac{1}{{\phi }^{2}\left(\sigma \right)}\left[{\left(f\left(x\right)-f\left(\mu \right)\right)}^{2}+2{\sigma }_{M}^{2}\right]\mid \mu ,\sigma \right] = \int {\left(\frac{x-\mu }{\phi \left(\sigma \right)}\right)}^{2}\frac{1}{{\phi }^{2}\left(\sigma \right)}\left[{\left(f\left(x\right)-f\left(\mu \right)\right)}^{2}+2{\sigma }_{M}^{2}\right]\, dN\left(\mu ,\phi \left(\sigma \right)\right) \le \frac{4{L}^{2}+2{\sigma }_{M}^{2}}{{\phi }^{2}\left(\sigma \right)} \int {y}^{2}\, dN\left(\mathrm{0,1}\right)\)
This shows that \(E\left[{G}_{1}^{2}\left(X,\xi \right)\mid X\right]\le {K}_{4}\) for some constant \({K}_{4}\) that is independent of X (since \(\phi \left(\sigma \right)\ge {\sigma }_{l}\)). Now, by the definition of J(·,·), we have
-
16.
\(\left|\frac{\partial J}{\partial \mu } \left(\mu ,\sigma \right)\right| = \left|\frac{\partial }{\partial \mu } \int f\left(x\right)\, dN\left(\mu ,\phi \left(\sigma \right)\right)\right| = \left|\int f\left(x\right)\frac{x- \mu }{{\phi }^{2}\left(\sigma \right)}\, dN\left(\mu ,\phi \left(\sigma \right)\right)\right| \le \frac{L}{\phi \left(\sigma \right)} \int \left|x\right|\, dN\left(\mathrm{0,1}\right)\).
This shows that \({g}_{1}\) is also bounded, so assumption A4 is satisfied. Since all the assumptions are satisfied, we can conclude that formulas 9 and 10 give the approximating ODE for the CALA algorithm. The main features of the algorithm that result in the specific form of this ODE are as follows.
-
17.
\(J\left(\mu ,\sigma \right)=\int f\left(x\right)\, dN\left(\mu ,\sigma \right)=\int f\left(x\right)\,\frac{1}{\sigma \sqrt{2\pi }}\,{e}^{-\frac{{\left(x-\mu \right)}^{2}}{2{\sigma }^{2}}}\, dx\)
where \(N\left(\mu ,\sigma \right)\) denotes the Gaussian density function with mean \(\mu\) and standard deviation \(\sigma\). Let
-
18.
\({F}_{1} \left(\mu , \sigma ,x,\beta ,{\beta }^{\prime}\right)= \frac{(\beta - {\beta }^{\prime})}{\phi (\sigma )}\cdot \frac{(x - \mu )}{\phi (\sigma )}\)
-
19.
\({F}_{2}\left(\mu , \sigma ,x,\beta ,{\beta }^{\prime}\right) = \frac{(\beta - {\beta }^{\prime})}{\phi (\sigma )}\left[{(\frac{(x - \mu )}{\phi (\sigma )})}^{2}-1\right]\)
Note that these functions capture the prominent terms of the learning algorithm (formula 24 of the paper).
We have
-
20.
E [\({F}_{1}\left(\mu \left(k\right),\sigma \left(k\right),x\left(k\right),{\beta }_{x\left(k\right)},{\beta }_{\mu \left(k\right)}\right) \mid \mu \left(k\right)= \mu ,\ \sigma \left(k\right)= \sigma\)] = \(\frac{\partial J}{\partial \mu }\left(\mu ,\phi \left(\sigma \right)\right)\)
-
21.
E [\({F}_{2}\left(\mu \left(k\right),\sigma \left(k\right),x\left(k\right),{\beta }_{x\left(k\right)},{\beta }_{\mu \left(k\right)}\right) \mid \mu \left(k\right)= \mu ,\ \sigma \left(k\right)= \sigma\)] = \(\frac{\partial J}{\partial \sigma }\left(\mu ,\phi \left(\sigma \right)\right)\)
The above equations show that the expected change in \(\mu \left(k\right)\) and \(\sigma \left(k\right)\), conditioned on their current values, is proportional to the gradient of the average reinforcement J. The ODE associated with the CALA algorithm is given by
-
22.
\(\frac{d\mu }{dt} = \frac{\partial J}{\partial \mu }\) (\(\mu ,\phi \left(\sigma \right)\)), \(\frac{d\sigma }{dt} = \frac{\partial J}{\partial \sigma }\) (\(\mu ,\phi \left(\sigma \right)\)) – K [\(\sigma -{\sigma }_{l}\)]
-
23.
Given any \(\Delta >0\) and any compact set \(\widetilde{K}\subset R\), there exists a \({\sigma }^{*}>0\) such that \({\mathrm{sup}}_{\mu \in \widetilde{K}}\left|\frac{\partial J}{\partial \mu }\left(\mu ,\sigma \right)-{f}^{\prime}\left(\mu \right)\right|<\Delta\) for \(0<\sigma <{\sigma }^{*}\). This result shows that the convergence of \(\frac{\partial J}{\partial \mu }\) to \({f}^{\prime}\left(\mu \right)=\frac{df\left(\mu \right)}{d\mu }\) as \(\sigma \to 0\) is uniform over all \(\mu \in \widetilde{K}\) for any compact set \(\widetilde{K}\).
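This convergence is easy to observe numerically using the representation \(\frac{\partial J}{\partial \mu }\left(\mu ,\sigma \right)=E\left[{f}^{\prime}\left(\sigma Z+\mu \right)\right]\) with \(Z\sim N\left(0,1\right)\): the gap to \({f}^{\prime}\left(\mu \right)\) shrinks as \(\sigma \to 0\). Again, `tanh` is only an illustrative stand-in for the reward function:

```python
import numpy as np

rng = np.random.default_rng(1)
f_prime = lambda x: 1.0 - np.tanh(x) ** 2   # derivative of the stand-in reward f = tanh
mu = 0.3
z = rng.normal(size=500_000)

# |dJ/dmu(mu, sigma) - f'(mu)| for a sequence of shrinking sigma values
gaps = [abs(np.mean(f_prime(s * z + mu)) - f_prime(mu)) for s in (1.0, 0.3, 0.1, 0.03)]
assert gaps[-1] < gaps[0]   # the gap vanishes as sigma -> 0
```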
-
24.
\(\frac{\partial J}{\partial \sigma }\left(\mu ,\phi \left(\sigma \right)\right)-K\left(\sigma - {\sigma }_{l}\right)=0\).
Putting all this together, we can conclude that by choosing a small value of \({\sigma }_{l} >0\), a small step size \(\lambda >0\), and a sufficiently high \(K>0\), we can ensure that the iterates \(\mu \left(k\right)\) of the CALA algorithm will be close to a maximum of the function f with high probability after a long time [39].
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Rahmani, P., Arefi, M. Improvement of energy-efficient resources for cognitive internet of things using learning automata. Peer-to-Peer Netw. Appl. 17, 297–320 (2024). https://doi.org/10.1007/s12083-023-01565-y