Online Learning and Optimization for Computation Offloading in D2D Edge Computing and Networks


Abstract

This paper introduces a framework of device-to-device edge computing and networks (D2D-ECN), a new paradigm for computation offloading and data processing that harnesses a group of resource-rich devices for the collaborative optimization of communication and computation. However, the computation process of task-intensive applications would be interrupted when the limited battery energy runs out. To tackle this issue, we apply energy harvesting technology in D2D-ECN to provide a green computation network and guarantee service continuity. Specifically, we design a reinforcement learning framework for a point-to-point offloading system to overcome the dynamics and uncertainty of renewable energy arrivals, channel states, and task generation rates. Furthermore, to cope with the high-dimensional, continuous-valued action space of an offloading system with multiple cooperating devices, we propose an online approach based on Lyapunov optimization for computation offloading and resource management that requires no a priori energy or network information. Numerical results demonstrate that our proposed scheme reduces the system operation cost while keeping task execution time low in D2D-ECN.



References

  1. Gubbi J, Buyya R, Marusic S, Palaniswami M (2013) Internet of things (IoT): a vision, architectural elements, and future directions. Future Gener Comput Syst 29(7):1645–1660


  2. ETSI ISG (2015) Mobile edge computing: a key technology towards 5G. ETSI White Paper 11:1–16


  3. Pu L, Chen C, Xu J, Fu X (2016) D2D fogging: an energy-efficient and incentive-aware task offloading framework via network-assisted D2D collaboration. IEEE J Sel Areas Commun 34(12):3887–3901


  4. Liu F et al (2013) Gearing resource-poor mobile devices with powerful clouds: architectures, challenges, and applications. IEEE Wireless Commun 20(3):14–22


  5. Zhang K et al (2016) Energy-efficient offloading for mobile edge computing in 5G heterogeneous networks. IEEE Access 4:5896–5907

  6. Shih Y-Y, Chung W-H, Pang A-C, Chiu T-C, Wei H-Y (2017) Enabling low-latency applications in fog-radio access networks. IEEE Netw 31(1):52–58


  7. Zhang K, Mao Y, Leng S, He Y, Zhang Y (2017) Mobile-edge computing for vehicular networks: a promising network paradigm with predictive off-loading. IEEE Veh Technol Mag 12(2):36–44

  8. Wang X, Leng S, Yang K (2017) Social-aware edge caching in fog radio access networks. IEEE Access 5:8492–8501


  9. Chen X, Pu L, Gao L, Wu W, Wu D (2017) Exploiting massive D2D collaboration for energy-efficient mobile edge computing. IEEE Wireless Commun 24(4):64–71

  10. Ti N, Le L (2017) Computation offloading leveraging computing resources from edge cloud and mobile peers. In: Proceedings of the IEEE International Conference on Communications (ICC)

  11. Meng X, Wang W, Zhang Z (2017) Delay-constrained hybrid computation offloading with cloud and fog computing. IEEE Access 5:21355–21367


  12. Gorlatova M, Wallwater A, Zussman G (2011) Networking low-power energy harvesting devices: measurements and algorithms. In: Proceedings of the IEEE Conf. Comput. Commun. (INFOCOM), pp 1602–1610

  13. Dhillon H, Li Y, Nuggehalli P, Pi Z, Andrews J (2014) Fundamentals of heterogeneous cellular networks with energy harvesting. IEEE Trans Wireless Commun 13(5):2782–2797


  14. Mao Y, Zhang J, Letaief KB (2016) Dynamic computation offloading for Mobile-Edge computing with energy harvesting devices. IEEE J Sel Areas Commun 34(12):3590–3605


  15. Fan B, Leng S, Yang K (2016) A dynamic bandwidth allocation algorithm in mobile networks with big data of users and networks. IEEE Netw 30(1):6–10

  16. Mastronarde N, van der Schaar M (2011) Fast reinforcement learning for energy-efficient wireless communication. IEEE Trans Signal Process 59(12):6262–6266

  17. Wei Y, Yu FR, Song M, Han Z (2017) User scheduling and resource allocation in HetNets with hybrid energy supply: an actor-critic reinforcement learning approach. IEEE Trans Wireless Commun 17(1):680–692

  18. Xu J, Chen L, Ren S (2017) Online learning for offloading and autoscaling in energy harvesting mobile edge computing. IEEE Trans Cogn Commun Netw 3(3):361–373


  19. Mao Y, You C, Zhang J, Huang K, Letaief KB (2017) A survey on mobile edge computing: the communication perspective. IEEE Commun Surveys Tuts 19(4):2322–2358

  20. Miettinen AP, Nurminen JK (2010) Energy efficiency of mobile clients in cloud computing. In: Proceedings of the USENIX Conf. Hot Topics Cloud Comput. (HotCloud), Boston, MA, USA, pp 1–7

  21. Wang Q, Leng S, Fu H, Zhang Y (2012) An IEEE 802.11p-based multichannel MAC scheme with channel coordination for vehicular ad hoc networks. IEEE Trans Intell Transp Syst 13(2):449–458

  22. Rajan D, Sabharwal A, Aazhang B (2004) Delay-bounded packet scheduling of bursty traffic over wireless channels. IEEE Trans Inform Theory 50(1):125–144


  23. Burd TD, Brodersen RW (1996) Processor design for portable systems. Kluwer J VLSI Signal Process Syst 13(2):203–221


  24. Altman E (1999) Constrained Markov decision processes. Chapman and Hall/CRC, Boca Raton

  25. Sutton RS, Barto AG (1998) Reinforcement learning: an introduction. MIT Press, Cambridge


  26. Salodkar N, Bhorkar A, Karandikar A, Borkar VS (2008) An on-line learning algorithm for energy efficient delay constrained scheduling over a fading channel. IEEE J Sel Areas Commun 26(4):732–742


  27. Neely MJ (2010) Stochastic network optimization with application to communication and queueing systems. Morgan & Claypool, San Rafael, CA

  28. Gibilisco P, Hiai F, Petz D (2009) Quantum covariance, quantum Fisher information, and the uncertainty relations. IEEE Trans Inform Theory 55(1):439–443


  29. Gorlatova M, Wallwater A, Zussman G (2012) Networking low-power energy harvesting devices: measurements and algorithms. IEEE Trans Mobile Comput 12(9):1853–1865

  30. Mitchell TM (1997) Machine learning. McGraw-Hill, New York


Acknowledgments

This work is supported by the joint fund of the Ministry of Education of China and China Mobile (MCM 20160304), the Fundamental Research Funds for the Central Universities, China (ZYGX2016Z011), and EU H2020 Project COSAFE (MSCA-RISE-2018-824019).

Author information

Corresponding author

Correspondence to Supeng Leng.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Convergence and optimality analysis of Algorithm 1

Given the Lagrange multiplier η, Algorithm 1 converges to the optimal state-action value function (i.e., Q(t) approaches the optimal \(Q^{*}\)) after a finite number of iterations, since it satisfies the following conditions [30]: (i) the state transition probability \(p(s^{\prime} | s,a)\) in Eq. 10 is stationary for any observed state \(s \in S\) and corresponding action \(a \in A\); (ii) the Lagrangian cost ρ in Eq. 13 is bounded, that is, the unconstrained cost satisfies \(\left| \rho(s,a) \right| < \rho_{\max}\) for each possible state-action pair (s,a), where \(\rho_{\max}\) is a constant; (iii) each state-action pair is explored infinitely often across episodes. As a consequence, \(Q_{\eta}\) converges to the optimal \(Q_{\eta}^{*}\) with probability 1. Moreover, it is worth noting that the execution time of the interactions between the agent and the external network environment has a considerable impact on the feasibility and efficiency of the Q-learning algorithm.
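To make the update rule behind these conditions concrete, the following is a minimal sketch of one tabular Q-learning step on the Lagrangian cost. The discounted-cost formulation, discount factor, and function names are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def q_update(Q, s, a, s_next, rho, alpha, gamma=0.9):
    """One tabular Q-learning step minimizing the Lagrangian cost rho(s, a).

    Q     : |S| x |A| array of state-action values (costs-to-go)
    rho   : observed Lagrangian cost, e.g. rho(s, a) = c(s, a) + eta * d(s, a)
    alpha : learning rate alpha(t) satisfying the conditions in Eq. 28
    gamma : discount factor (assumed discounted-cost formulation)
    """
    td_target = rho + gamma * Q[s_next].min()   # cost is minimized, hence min
    Q[s, a] += alpha * (td_target - Q[s, a])    # temporal-difference correction
    return Q
```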

In addition, we impose the following conditions on the learning rates α(t) and β(t) to guarantee the convergence stated in Theorem 1 [26]:

$$ \left\{ \begin{array}{l} \sum\nolimits_{t = 0}^{\infty} {\alpha (t)} = \infty ,\quad \sum\nolimits_{t = 0}^{\infty} {\beta (t)} = \infty \\ \sum\nolimits_{t = 0}^{\infty} {{\alpha^{2}}(t)} < \infty ,\quad \sum\nolimits_{t = 0}^{\infty} {{\beta^{2}}(t)} < \infty \\ \underset{t \to \infty }{\lim } \frac{{\beta (t)}}{{\alpha (t)}} = 0 \end{array} \right. $$
(28)

Consequently, Eq. 19 is guaranteed to yield the optimal policy within a finite number of iterations. ■
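For concreteness, one hypothetical pair of step-size schedules satisfying all three conditions in Eq. 28 is sketched below; the exponents are illustrative choices, not taken from the paper.

```python
def alpha(t):
    # Fast timescale for the Q-value updates: sum of alpha(t) diverges
    # (exponent 0.6 <= 1), sum of alpha(t)^2 converges (exponent 1.2 > 1).
    return 1.0 / (t + 1) ** 0.6

def beta(t):
    # Slow timescale for the Lagrange multiplier eta: same divergence /
    # convergence pattern, and beta(t) / alpha(t) = (t + 1)**(-0.3) -> 0,
    # so eta is updated more slowly than the Q-values.
    return 1.0 / (t + 1) ** 0.9
```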

Appendix B: Proof for theorem 1

Based on the inequality \(\max {[x,0]^{2}} \le {x^{2}}\), the battery energy state transition can be bounded as

$$\begin{array}{@{}rcl@{}} {[b(t + 1)]^{2}} &\le& {[b(t)]^{2}} + {[e(t)]^{2}} + {[g(t)]^{2}}\\ &&- 2b(t)[g(t) - e(t)] \end{array} $$
(29)
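For completeness, inequality (29) follows by expanding the square, assuming battery dynamics of the form \(b(t+1) = \max [b(t) - g(t) + e(t), 0]\) (an assumption consistent with the \(\max {[x,0]^{2}} \le {x^{2}}\) step; the exact dynamics are defined in the body of the paper):

$$\begin{array}{@{}rcl@{}} {[b(t + 1)]^{2}} &\le& {[b(t) - g(t) + e(t)]^{2}}\\ &=& {[b(t)]^{2}} + {[e(t)]^{2}} + {[g(t)]^{2}}\\ && - 2b(t)[g(t) - e(t)] - 2e(t)g(t) \end{array} $$

where the cross term \(-2e(t)g(t) \le 0\) is dropped because \(e(t)\) and \(g(t)\) are nonnegative.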

Then, substituting the Lyapunov function (23) and the above inequality (29) into the Lyapunov drift function (24), we obtain

$$ E\left\{ {L(t + 1) - L(t)} \right\} \le \frac{1}{2}E\left\{ {\left. {{{[e(t)]}^{2}} + {{[g(t)]}^{2}} - 2b(t)[g(t) - e(t)]} \right|b(t)} \right\} $$
(30)

Since the renewable energy arrivals and the energy consumption are bounded, i.e., \(\left| {e(t)} \right| \le {e_{\max }}\) and \(\left| {g(t)} \right| \le {g_{\max }}\), we define the constant

$$ \vartheta = \frac{1}{2}\left({e_{\max}^{2} + g_{\max}^{2}} \right) $$
(31)

Therefore, adding \(\pi E\left \{ {c(t)} \right \}\) to both sides of inequality (30) yields Eq. 26. ■
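Inequality (30) underpins the standard drift-plus-penalty technique [27]: in each slot, the controller can greedily minimize the right-hand side of (30) plus the π-weighted operation cost. A minimal per-slot sketch follows, where actions, cost, and energy_use are assumed interfaces rather than the paper's definitions.

```python
def drift_plus_penalty_action(b, actions, cost, energy_use, e, pi):
    """Pick the action minimizing the Lyapunov drift bound plus pi * cost.

    b          : current battery level b(t)
    actions    : iterable of feasible offloading / resource decisions
    cost       : cost(a) -> operation cost c(t) under action a
    energy_use : energy_use(a) -> energy consumption g(t) under action a
    e          : harvested energy e(t) observed in this slot
    pi         : drift-penalty tradeoff weight
    """
    def objective(a):
        g = energy_use(a)
        drift_bound = 0.5 * (e ** 2 + g ** 2) - b * (g - e)  # RHS of (30)
        return drift_bound + pi * cost(a)
    return min(actions, key=objective)
```

A larger π trades a larger battery-queue drift for a lower time-average operation cost, which is the usual Lyapunov tradeoff.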


Cite this article

Qiao, G., Leng, S. & Zhang, Y. Online Learning and Optimization for Computation Offloading in D2D Edge Computing and Networks. Mobile Netw Appl 27, 1111–1122 (2022). https://doi.org/10.1007/s11036-018-1176-y
