Online computation offloading with double reinforcement learning algorithm in mobile edge computing

https://doi.org/10.1016/j.jpdc.2022.09.006

Highlights

  • An online computation offloading model for mobile edge computing systems.

  • A double-DQN and DDPG based algorithm to reduce delay and energy consumption.

  • An adaptive prioritized experience replay algorithm to improve training efficiency.

Abstract

Smart mobile devices have recently emerged as a promising computing platform for computation tasks. However, task performance is restricted by the computing power and battery capacity of mobile devices. Mobile edge computing, an extension of cloud computing, addresses this problem well by providing computational support to mobile devices. In this paper, we discuss a mobile edge computing system with a server and multiple mobile devices that need to perform computation tasks with priorities. The limited resources of the mobile edge computing server and the mobile devices make it challenging to develop an offloading strategy that minimizes both delay and energy consumption in the long term. To this end, an online algorithm is proposed, namely, the double reinforcement learning computation offloading (DRLCO) algorithm, which jointly determines the offloading decision, the CPU frequency, and the transmit power for computation offloading. Concretely, we first formulate the power scheduling problem for mobile users to minimize energy consumption. Inspired by reinforcement learning, we solve the problem by presenting a power scheduling algorithm based on the deep deterministic policy gradient (DDPG). Then, we model the task offloading problem to minimize the delay of tasks and propose a double Deep Q-Network (DQN) based algorithm. In the decision-making process, we fully consider the influence of the task queue information, the channel state information, and the task information. Moreover, we propose an adaptive prioritized experience replay algorithm to improve the model training efficiency. We conduct extensive simulations to verify the effectiveness of the scheme, and the results show that, compared with conventional schemes, our method reduces the delay by 48% and the energy consumption by 53%.
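The adaptive prioritized experience replay scheme itself is presented later in the paper; as background, the following minimal Python sketch shows the standard proportional prioritized replay mechanism it extends (the class name, capacity, alpha, and API below are our own illustrative assumptions, not the authors' implementation):

```python
import random

class PrioritizedReplayBuffer:
    """Baseline proportional prioritized experience replay: transitions with
    larger TD error are sampled more often. Illustrative sketch only; the
    paper's *adaptive* variant adjusts this scheme during training."""

    def __init__(self, capacity=10000, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.priorities = [], []

    def add(self, transition, td_error=1.0):
        if len(self.data) >= self.capacity:          # evict the oldest transition
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append((abs(td_error) + 1e-6) ** self.alpha)

    def sample(self, batch_size):
        # sample indices in proportion to stored priorities
        idx = random.choices(range(len(self.data)),
                             weights=self.priorities, k=batch_size)
        return idx, [self.data[i] for i in idx]

    def update_priorities(self, idx, td_errors):
        # refresh priorities with the TD errors from the latest update
        for i, err in zip(idx, td_errors):
            self.priorities[i] = (abs(err) + 1e-6) ** self.alpha
```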

Introduction

With the development of wireless communication and the Internet of Things (IoT), smart mobile devices (MDs) have become a new mobile computing platform on which applications such as video surveillance, face recognition, and natural language processing are widely deployed [36], [14]. These applications pose strict requirements on the computational power of MDs, especially the computation-intensive ones. The contradiction between resource-constrained MDs and computation-intensive applications becomes the bottleneck in providing a satisfactory quality of experience (QoE) [19].

As an effective way to address this problem, various computation offloading schemes have recently been proposed, which migrate computation tasks to other devices or platforms for execution [5]. Cloud computing systems, for example, transfer all or part of the computation tasks to the cloud server to alleviate the heavy burden on the MDs. The main drawback of this cloud-based offloading approach is that it usually incurs an unacceptable transmission delay, as the cloud is typically far away from the clients. In contrast, mobile edge computing (MEC) deploys servers or micro-servers close to the MDs to reduce the transmission delay [24]. Therefore, MEC has become a promising computing paradigm for various mobile applications [21].

There has been some research on the problem of computation offloading in MEC systems, most of which aims at enhancing the users' QoE. In [11], the offloading problem is cast as a convex optimization that minimizes the weighted sum of the execution delay and the energy consumption by jointly tuning the computation offloading ratio, the processor clock rate, the bandwidth allocation, and the transmit power. To improve the delay performance, the general dependency among tasks was analyzed and a genetic algorithm-based solution was adopted in [2]. Recently, computation offloading in UAV-assisted multi-user MEC scenarios has also been discussed in [38].

In the previous works, however, all tasks are assumed to be equally important, neglecting the fact that computation tasks are of different importance to users. For example, on MDs, security tasks (such as road detection and vehicle detection) have the highest priority, followed by real-time tasks (such as games and AR/VR), with non-real-time tasks (such as user behavior analysis) having the lowest priority. Moreover, the queueing and execution delays on the MEC servers should also be considered when solving the optimal offloading problem. A high-performance computation offloading scheme tailored to priority-based tasks and resource-constrained MEC servers is therefore highly desirable.

The challenges of computation offloading in MEC systems are threefold: 1) the mobile network with edge servers is stochastic and dynamic; task execution is affected by multiple factors, such as the channel state information (CSI) and the task queue state information, so the cost and performance of computation tasks change with the states and execution modes of the MDs and edge servers; 2) the energy consumption of MDs is constrained by the battery capacity, hence the transmit power and CPU frequency of the MDs should be reasonably scheduled to save energy when offloading computation tasks; 3) as the states evolve in a continuous space, each offloading decision influences the next one. Thus, the delay-energy tradeoff optimization problem is a long-term mixed-integer linear programming (LT-MILP) problem, which has been proven to be NP-hard [20].
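To make these decision variables concrete before the formal model, the following minimal Python sketch shows one plausible encoding of the per-step state (CSI, queue state, and task information) and the joint action (offloading decision, CPU frequency, and transmit power); all field names are illustrative assumptions, not the paper's notation:

```python
from dataclasses import dataclass

@dataclass
class State:
    """One observation of the MEC system at a decision step (illustrative fields)."""
    channel_gain: float     # channel state information (CSI)
    md_queue_len: int       # backlog of the mobile device's execution queue
    server_queue_len: int   # backlog of the MEC server's execution queue
    task_size_bits: float   # input data size of the arriving task
    task_cycles: float      # CPU cycles the task requires
    task_priority: int      # smaller value = more important task

@dataclass
class Action:
    """Joint decision produced at each step (illustrative fields)."""
    offload: bool           # True: execute on the MEC server; False: locally
    cpu_freq_hz: float      # local CPU frequency chosen by the power scheduler
    tx_power_w: float       # transmit power chosen by the power scheduler
```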

In this paper, we propose an efficient computation offloading scheme, called double reinforcement learning computation offloading (DRLCO), for resource-constrained MEC systems. Our scheme has two optimization objectives: 1) to minimize the weighted sum of the execution delay and the energy consumption of computation tasks in the long term; 2) to improve the users' QoE by reducing the delay of important tasks. Both the MDs and the MEC servers are assumed to be resource-limited, and the MEC server can provide computational support for multiple MDs to handle tasks with priorities. Due to the stochastic and dynamic nature of the MEC system, we reformulate the computation offloading problem as a Markov Decision Process (MDP) problem and solve it by utilizing reinforcement learning techniques. The main contributions of this work are summarized as follows:

  • We consider a scenario where both the MEC server and the MDs are resource-constrained, and develop a computation offloading model that includes the offloading controller, the task execution queues on the MEC server and the MDs, and the task transmission queue between the MEC server and the MDs. In addition, different types of tasks have different execution priorities in the MEC system.

  • We define the optimization goal of computation offloading as reducing the task execution delay and the energy consumption of the mobile devices, and propose a double reinforcement learning computation offloading algorithm. Specifically, we decompose the computation offloading process into a power scheduling process and a task offloading process. For power scheduling, we propose a Deep Deterministic Policy Gradient (DDPG) based approach that reduces the energy consumption of the MDs by scheduling their transmit power and CPU frequency. For task offloading, we propose a double Deep Q-Network (DQN) based approach that reduces the execution delay of tasks by making offloading decisions (a minimal sketch of the double-DQN target follows this list), together with an adaptive prioritized experience replay algorithm to improve the model training efficiency. Besides, the task queues are sorted according to task priorities to reduce the waiting delay and improve the users' QoE.

  • We conduct extensive experiments to evaluate the performance of the proposed algorithms. The simulation results verify that our approach outperforms other state-of-the-art schemes, reducing the delay by 48% and the energy consumption by 53%.
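As a minimal sketch of the double-DQN ingredient referenced in the contributions above, the target value for an offloading decision is computed by selecting the next action with the online network but evaluating it with the target network, which curbs the overestimation bias of vanilla DQN. This is the standard double Q-learning target, assumed here rather than taken from the paper's exact implementation:

```python
import numpy as np

def double_dqn_target(reward, q_online_next, q_target_next,
                      gamma=0.99, done=False):
    """Standard double-DQN target. Inputs are Q-value vectors over the
    discrete offloading actions at the next state."""
    if done:
        return reward
    best_action = int(np.argmax(q_online_next))          # select with the online net
    return reward + gamma * q_target_next[best_action]   # evaluate with the target net

# Toy usage with two offloading actions (0 = local execution, 1 = offload).
q_online = np.array([1.2, 3.4])
q_target = np.array([1.0, 2.8])
td_target = double_dqn_target(reward=-0.5,
                              q_online_next=q_online,
                              q_target_next=q_target)
```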

The rest of this paper is organized as follows. Section 2 discusses the related work. Section 3 presents the system model. Section 4 formulates the cost minimization problem and reformulates it as an MDP problem. Section 5 details the Double DQN based approach, which obtains the optimal task offloading policy, and the DDPG based approach, which schedules the transmit power and CPU frequency of the MDs. Section 6 evaluates the performance of the proposed method through extensive simulations. Finally, Section 7 concludes the paper and gives some future directions.

Section snippets

Related work

Existing research on computation offloading in MEC systems, most of which aims at enhancing the users' QoE, can be roughly classified into three categories according to the optimization objective: delay-optimal, energy-optimal, and energy-delay tradeoff computation offloading.

For delay-sensitive applications, improving the delay performance is the main objective of computation offloading. Liu et al. [23] proposed a one-dimensional search algorithm to minimize the total delay.

System model

In this section, we first introduce the system model studied in this paper, and then elaborate on the computation model and the task queue model.
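As a hedged illustration of how the priority-ordered task queues described in the introduction can be realized (the task classes and API below are our own examples, not the paper's queue model), a binary heap keyed on (priority, arrival order) serves higher-priority tasks first while preserving FIFO order within a class:

```python
import heapq
import itertools

# Illustrative priority classes; smaller value = higher priority.
SECURITY, REALTIME, BACKGROUND = 0, 1, 2

_arrival = itertools.count()   # tie-breaker: FIFO within a priority class
task_queue = []

def enqueue(task_name, priority):
    heapq.heappush(task_queue, (priority, next(_arrival), task_name))

def dequeue():
    priority, _, task_name = heapq.heappop(task_queue)
    return task_name

enqueue("user_behavior_analysis", BACKGROUND)
enqueue("vehicle_detection", SECURITY)
enqueue("ar_rendering", REALTIME)
assert dequeue() == "vehicle_detection"   # security tasks are served first
```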

Problem formulation

In this section, we first introduce the execution cost of a task and formulate the execution cost minimization (ECM) problem. Then, we define the power scheduling and task offloading problems based on the MDP model [7].
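The full ECM formulation is given in the paper's Section 4; as a rough guide, a long-term weighted delay-energy objective of the kind described in the abstract typically takes a shape such as the following (our notation, an assumed form rather than the authors' exact objective):

```latex
\min_{\{x_t,\, f_t,\, p_t\}_{t \ge 1}} \;
\lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T}
\mathbb{E}\bigl[\, w_d \, D_t(x_t, f_t, p_t) + w_e \, E_t(x_t, f_t, p_t) \bigr]
```

where x_t ∈ {0, 1} is the offloading decision, f_t the CPU frequency, p_t the transmit power, D_t and E_t the per-step delay and energy consumption, and w_d, w_e the tradeoff weights.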

Reinforcement learning for computation offloading

This section presents a reinforcement learning-based approach for computation offloading in the MEC system. We first briefly introduce and analyze existing reinforcement learning methods; then we describe in detail the DDPG algorithm that solves the power scheduling problem [22] and the Double DQN algorithm that solves the task offloading problem [34].
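The snippet below is a generic DDPG update step in PyTorch, offered only as an assumed-standard sketch of how the power-scheduling agent could be trained; the network sizes, dimensions, and hyperparameters are our own choices, not the paper's:

```python
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 6, 2   # assumed: action = (CPU frequency, transmit power) in [0, 1]

def mlp(in_dim, out_dim, out_act=None):
    layers = [nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim)]
    if out_act is not None:
        layers.append(out_act)
    return nn.Sequential(*layers)

actor = mlp(STATE_DIM, ACTION_DIM, nn.Sigmoid())      # deterministic policy mu(s)
critic = mlp(STATE_DIM + ACTION_DIM, 1)               # action-value Q(s, a)
actor_tgt = mlp(STATE_DIM, ACTION_DIM, nn.Sigmoid())
critic_tgt = mlp(STATE_DIM + ACTION_DIM, 1)
actor_tgt.load_state_dict(actor.state_dict())
critic_tgt.load_state_dict(critic.state_dict())

actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def ddpg_update(s, a, r, s2, gamma=0.99, tau=0.005):
    """One DDPG step on a batch (s, a, r, s2) sampled from the replay buffer."""
    # Critic: regress Q(s, a) onto the bootstrapped target from the target nets.
    with torch.no_grad():
        q_next = critic_tgt(torch.cat([s2, actor_tgt(s2)], dim=1))
        target = r + gamma * q_next
    critic_loss = nn.functional.mse_loss(critic(torch.cat([s, a], dim=1)), target)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Actor: deterministic policy gradient, ascend Q(s, mu(s)).
    actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

    # Polyak-average the target networks toward the online networks.
    for net, tgt in ((actor, actor_tgt), (critic, critic_tgt)):
        for p, p_t in zip(net.parameters(), tgt.parameters()):
            p_t.data.mul_(1 - tau).add_(tau * p.data)
```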

Performance analysis

In this section, we first introduce the simulation setup and the baselines. Then the performance of our computation offloading policy is evaluated through extensive simulations. The schemes are implemented in Python 3.6, and the experiments are run on a Windows 10 computer with a 2.9 GHz Intel Core i5 CPU, an NVIDIA GeForce GTX 1660 GPU, and 16 GB of RAM.

Conclusion

In this article, we model a computation offloading framework for the resource-constrained MEC system. Then, we take the task execution delay and energy consumption, key factors in measuring the QoE of mobile users, as the optimization objectives. To minimize both the execution delay and the energy consumption, we propose a double reinforcement learning computation offloading algorithm that jointly schedules the CPU frequency, the transmission power, and the task offloading mode. As compared with other benchmark schemes, the proposed method reduces the delay by 48% and the energy consumption by 53%.

CRediT authorship contribution statement

This paper studies the problem of computation offloading in resource-constrained MEC systems and proposes a double reinforcement learning computation offloading algorithm. Experiments show that our method reduces the delay by 48% and the energy consumption by 53%. In this work, Liao's main contributions are the experiments and the writing of the paper. The experiments were conducted under the guidance of Lai. In addition, Yang and Zeng provided technical guidance and financial support.


Declaration of Competing Interest

We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and that there is no professional or other personal interest of any nature or kind in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled “Online Computation Offloading with Double Reinforcement Learning Algorithm in Mobile Edge Computing”.

Acknowledgements

This work is partially supported by the Natural Science Foundation of Guangdong (2021A1515011578), the Funding of the Longyan Institute of Industry and Education Integration, Xiamen University (20210302), the Natural Science Foundation of China (61872154), the Natural Science Foundation of Fujian (2018J01097), and the Shenzhen Basic Research Program (JCYJ20190809161603551).

Linbo Liao received the B.S. degree in Internet of Things Engineering from Nanchang Hangkong University in 2019 and the master's degree from the School of Informatics, Xiamen University, in 2022. His research interests include edge and distributed computing and machine learning.

References (39)

  • J. Gubbi et al., Internet of Things (IoT): a vision, architectural elements, and future directions, Future Gener. Comput. Syst. (2013)

  • F. Song et al., Offloading dependent tasks in multi-access edge computing: a multi-objective reinforcement learning approach, Future Gener. Comput. Syst. (2022)

  • A.A. Al-Habob et al., Collision-free sequential task offloading for mobile edge computing, IEEE Commun. Lett. (2019)

  • A.A. Al-Habob et al., Task scheduling for mobile edge computing using genetic algorithm and conflict graphs, IEEE Trans. Veh. Technol. (2020)

  • T. Alfakih et al., Task offloading and resource allocation for mobile edge computing by deep reinforcement learning based on SARSA, IEEE Access (2020)

  • P.A. Apostolopoulos et al., Cognitive data offloading in mobile edge computing for the Internet of Things, IEEE Access (2020)

  • S. Barbarossa et al., Communicating while computing: distributed mobile cloud computing over 5G heterogeneous networks, IEEE Signal Process. Mag. (2014)

  • R. Bellman, Dynamic programming, Science (1966)

  • D.S. Bernstein et al., The complexity of decentralized control of Markov decision processes, Math. Oper. Res. (2002)

  • T.D. Burd et al., Processor design for portable systems, J. VLSI Signal Process. Syst. Signal Image Video Technol. (1996)

  • N. Chen et al., When learning joins edge: real-time proportional computation offloading via deep reinforcement learning

  • X. Chen et al., Efficient multi-user computation offloading for mobile-edge cloud computing, IEEE/ACM Trans. Netw. (2015)

  • X. Chen et al., Efficient resource allocation for relay-assisted computation offloading in mobile-edge computing, IEEE Internet Things J. (2019)

  • Y. Ding et al., A code-oriented partitioning computation offloading strategy for multiple users and multiple mobile edge computing servers, IEEE Trans. Ind. Inform. (2019)

  • T.Q. Dinh et al., Offloading in mobile edge computing: task allocation and computational frequency scaling, IEEE Trans. Commun. (2017)

  • S. Hu et al., Dynamic request scheduling optimization in mobile edge computing for IoT applications, IEEE Internet Things J. (2019)

  • L. Huang et al., Deep reinforcement learning for online computation offloading in wireless powered mobile-edge computing networks, IEEE Trans. Mob. Comput. (2019)

  • L.P. Kaelbling et al., Reinforcement learning: a survey, J. Artif. Intell. Res. (1996)

  • M. Kamoun et al., Joint resource allocation and offloading strategies in cloud enabled cellular networks


Yongxuan Lai received the bachelor's degree in Management Information Systems from Renmin University of China in 2004 and the Ph.D. degree in Computer Science from Renmin University of China in 2009. He is currently a professor in the Software Engineering Department, School of Informatics, Xiamen University, China. He is also the dean of the School of Mathematics and Information Engineering, Longyan University, China. He was a visiting scholar at the University of Queensland, Australia, from Sep. 2017 to Sep. 2018. His research interests include network data management, intelligent transportation systems, and big data management and analysis.

Fan Yang received his Ph.D. degree in Control Theory and Control Engineering from Xiamen University in 2009. He is currently an associate professor in the Department of Automation at Xiamen University. His research interests include feature selection, ensemble learning, and intelligent transportation systems.

Wenhua Zeng received his Ph.D. degree in Industrial Automation from Zhejiang University in 1989. He is currently a professor in the Department of Software Engineering at Xiamen University. His research interests include embedded systems, embedded software, the Internet of Things, and cloud computing.
