Online computation offloading with double reinforcement learning algorithm in mobile edge computing

https://doi.org/10.1016/j.jpdc.2022.09.006

Highlights

  • An online computation offloading model for mobile edge computing systems.

  • A double-DQN and DDPG based algorithm to reduce delay and energy consumption.

  • An adaptive prioritized experience replay algorithm to improve training efficiency.

Abstract

Smart mobile devices have recently emerged as a promising computing platform for computation tasks. However, task performance is restricted by the computing power and battery capacity of mobile devices. Mobile edge computing, an extension of cloud computing, addresses this problem well by providing computational support to mobile devices. In this paper, we discuss a mobile edge computing system with a server and multiple mobile devices that need to perform computation tasks with priorities. The limited resources of the mobile edge computing server and the mobile devices make it challenging to develop an offloading strategy that minimizes both delay and energy consumption in the long term. To this end, an online algorithm is proposed, namely, the double reinforcement learning computation offloading (DRLCO) algorithm, which jointly determines the offloading decision, the CPU frequency, and the transmit power for computation offloading. Concretely, we first formulate the power scheduling problem for mobile users to minimize energy consumption. Inspired by reinforcement learning, we solve the problem by presenting a power scheduling algorithm based on the deep deterministic policy gradient (DDPG). Then, we model the task offloading problem to minimize the delay of tasks and propose a double Deep Q-Network (DQN) based algorithm. In the decision-making process, we fully consider the influence of the task queue information, the channel state information, and the task information. Moreover, we propose an adaptive prioritized experience replay algorithm to improve the model training efficiency. We conduct extensive simulations to verify the effectiveness of the scheme, and the results show that, compared with conventional schemes, our method reduces the delay by 48% and the energy consumption by 53%.
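The adaptive prioritized experience replay scheme itself is presented later in the paper; as background, the following minimal Python sketch shows the standard proportional prioritized replay mechanism it extends (the class name, capacity, alpha, and API below are our own illustrative assumptions, not the authors' implementation):

```python
import random

class PrioritizedReplayBuffer:
    """Baseline proportional prioritized experience replay: transitions with
    larger TD error are sampled more often. Illustrative sketch only; the
    paper's *adaptive* variant adjusts this scheme during training."""

    def __init__(self, capacity=10000, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.priorities = [], []

    def add(self, transition, td_error=1.0):
        if len(self.data) >= self.capacity:          # evict the oldest transition
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append((abs(td_error) + 1e-6) ** self.alpha)

    def sample(self, batch_size):
        # sample indices in proportion to stored priorities
        idx = random.choices(range(len(self.data)),
                             weights=self.priorities, k=batch_size)
        return idx, [self.data[i] for i in idx]

    def update_priorities(self, idx, td_errors):
        # refresh priorities with the TD errors from the latest update
        for i, err in zip(idx, td_errors):
            self.priorities[i] = (abs(err) + 1e-6) ** self.alpha
```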

Introduction

With the development of wireless communication and the Internet of Things (IoT), smart mobile devices (MDs) have become a new mobile computing platform on which applications such as video surveillance, face recognition, and natural language processing are widely deployed [36], [14]. These applications pose strict requirements on the computational power of MDs, especially the computation-intensive ones. The contradiction between resource-constrained MDs and computation-intensive applications becomes the bottleneck in providing a satisfactory quality of experience (QoE) [19].

As an effective way to address this problem, various computation offloading schemes have recently been proposed, which migrate computation tasks to other devices or platforms for execution [5]. Cloud computing systems, for example, transfer all or part of the computation tasks to the cloud server to alleviate the heavy burden on the MDs. The main drawback of this cloud-based offloading approach is that it usually incurs an unacceptable transmission delay, as the cloud is typically far away from the clients. In contrast, mobile edge computing (MEC) deploys servers or micro-servers close to the MDs to reduce the transmission delay [24]. Therefore, MEC has become a promising computing paradigm for various mobile applications [21].

There has been some research on the problem of computation offloading in MEC systems, most of which aims at enhancing the users' QoE. In [11], the offloading problem is cast as a convex optimization that minimizes the weighted sum of the execution delay and the energy consumption by jointly tuning the computation offloading ratio, the processor clock rate, the bandwidth allocation, and the transmit power. To improve the delay performance, the general dependency among tasks was analyzed and a genetic algorithm-based solution was adopted in [2]. Recently, computation offloading in UAV-assisted multi-user MEC scenarios has also been discussed in [38].

In the previous works, however, all tasks are assumed to be equally important, neglecting the fact that computation tasks are of different importance to users. For example, on MDs, security tasks (such as road detection and vehicle detection) have the highest priority, followed by real-time tasks (such as games and AR/VR), with non-real-time tasks (such as user behavior analysis) having the lowest priority. Moreover, the queueing and execution delays on the MEC servers should also be considered when solving the optimal offloading problem. A high-performance computation offloading scheme tailored to priority-based tasks and resource-constrained MEC servers is therefore highly desirable.

The challenges of computation offloading in MEC systems are threefold: 1) the mobile network with edge servers is stochastic and dynamic; task execution is affected by multiple factors, such as the channel state information (CSI) and the task queue state information, so the cost and performance of computation tasks change with the states and execution modes of the MDs and edge servers; 2) the energy consumption of MDs is constrained by the battery capacity, hence the transmit power and CPU frequency of the MDs should be reasonably scheduled to save energy when offloading computation tasks; 3) as the states evolve in a continuous space, each offloading decision influences the next one. Thus, the delay-energy tradeoff optimization problem is a long-term mixed-integer linear programming (LT-MILP) problem, which has been proven to be NP-hard [20].
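To make these decision variables concrete before the formal model, the following minimal Python sketch shows one plausible encoding of the per-step state (CSI, queue state, and task information) and the joint action (offloading decision, CPU frequency, and transmit power); all field names are illustrative assumptions, not the paper's notation:

```python
from dataclasses import dataclass

@dataclass
class State:
    """One observation of the MEC system at a decision step (illustrative fields)."""
    channel_gain: float     # channel state information (CSI)
    md_queue_len: int       # backlog of the mobile device's execution queue
    server_queue_len: int   # backlog of the MEC server's execution queue
    task_size_bits: float   # input data size of the arriving task
    task_cycles: float      # CPU cycles the task requires
    task_priority: int      # smaller value = more important task

@dataclass
class Action:
    """Joint decision produced at each step (illustrative fields)."""
    offload: bool           # True: execute on the MEC server; False: locally
    cpu_freq_hz: float      # local CPU frequency chosen by the power scheduler
    tx_power_w: float       # transmit power chosen by the power scheduler
```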

In this paper, we propose an efficient computation offloading scheme, called double reinforcement learning computation offloading (DRLCO), for resource-constrained MEC systems. Our scheme has two optimization objectives: 1) to minimize the weighted sum of the execution delay and the energy consumption of computation tasks in the long term; 2) to improve the users' QoE by reducing the delay of important tasks. Both the MDs and the MEC servers are assumed to be resource-limited, and the MEC server can provide computational support for multiple MDs to handle tasks with priorities. Due to the stochastic and dynamic nature of the MEC system, we reformulate the computation offloading problem as a Markov Decision Process (MDP) problem and solve it by utilizing reinforcement learning techniques. The main contributions of this work are summarized as follows:

  • We consider a scenario where both the MEC server and the MDs are resource-constrained, and develop a computation offloading model that includes the offloading controller, the task execution queues on the MEC server and the MDs, and the task transmission queue between the MEC server and the MDs. In addition, different types of tasks have different execution priorities in the MEC system.

  • We define the optimization goal of computation offloading as reducing the task execution delay and the energy consumption of the mobile devices, and propose a double reinforcement learning computation offloading algorithm. Specifically, we decompose the computation offloading process into a power scheduling process and a task offloading process. For power scheduling, we propose a Deep Deterministic Policy Gradient (DDPG) based approach that reduces the energy consumption of the MDs by scheduling their transmit power and CPU frequency. For task offloading, we propose a double Deep Q-Network (DQN) based approach that reduces the execution delay of tasks by making offloading decisions (a minimal sketch of the double-DQN target follows this list), together with an adaptive prioritized experience replay algorithm to improve the model training efficiency. Besides, the task queues are sorted according to task priorities to reduce the waiting delay and improve the users' QoE.

  • We conduct extensive experiments to evaluate the performance of the proposed algorithms. The simulation results verify that our approach outperforms other state-of-the-art schemes, reducing the delay by 48% and the energy consumption by 53%.
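As a minimal sketch of the double-DQN ingredient referenced in the contributions above, the target value for an offloading decision is computed by selecting the next action with the online network but evaluating it with the target network, which curbs the overestimation bias of vanilla DQN. This is the standard double Q-learning target, assumed here rather than taken from the paper's exact implementation:

```python
import numpy as np

def double_dqn_target(reward, q_online_next, q_target_next,
                      gamma=0.99, done=False):
    """Standard double-DQN target. Inputs are Q-value vectors over the
    discrete offloading actions at the next state."""
    if done:
        return reward
    best_action = int(np.argmax(q_online_next))          # select with the online net
    return reward + gamma * q_target_next[best_action]   # evaluate with the target net

# Toy usage with two offloading actions (0 = local execution, 1 = offload).
q_online = np.array([1.2, 3.4])
q_target = np.array([1.0, 2.8])
td_target = double_dqn_target(reward=-0.5,
                              q_online_next=q_online,
                              q_target_next=q_target)
```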

The rest of this paper is organized as follows. Section 2 discusses the related work. Section 3 presents the system model. Section 4 formulates the cost minimization problem and reformulates it as an MDP problem. Section 5 details the Double DQN based approach, which obtains the optimal task offloading policy, and the DDPG based approach, which schedules the transmit power and CPU frequency of the MDs. Section 6 evaluates the performance of the proposed method through extensive simulations. Finally, Section 7 concludes the paper and gives some future directions.

Section snippets

Related work

Existing research on computation offloading in MEC systems, most of which aims at enhancing the users' QoE, can be roughly classified into three categories according to the optimization objective: delay-optimal, energy-optimal, and energy-delay tradeoff computation offloading.

For delay-sensitive applications, improving the delay performance is the main objective of computation offloading. Liu et al. [23] proposed a one-dimensional search algorithm to minimize the total delay.

System model

In this section, we first introduce the system model studied in this paper, and then elaborate on the computation model and the task queue model.
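As a hedged illustration of how the priority-ordered task queues described in the introduction can be realized (the task classes and API below are our own examples, not the paper's queue model), a binary heap keyed on (priority, arrival order) serves higher-priority tasks first while preserving FIFO order within a class:

```python
import heapq
import itertools

# Illustrative priority classes; smaller value = higher priority.
SECURITY, REALTIME, BACKGROUND = 0, 1, 2

_arrival = itertools.count()   # tie-breaker: FIFO within a priority class
task_queue = []

def enqueue(task_name, priority):
    heapq.heappush(task_queue, (priority, next(_arrival), task_name))

def dequeue():
    priority, _, task_name = heapq.heappop(task_queue)
    return task_name

enqueue("user_behavior_analysis", BACKGROUND)
enqueue("vehicle_detection", SECURITY)
enqueue("ar_rendering", REALTIME)
assert dequeue() == "vehicle_detection"   # security tasks are served first
```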

Problem formulation

In this section, we first introduce the execution cost of a task and formulate the execution cost minimization (ECM) problem. Then, we define the power scheduling and task offloading problems based on the MDP model [7].
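The full ECM formulation is given in the paper's Section 4; as a rough guide, a long-term weighted delay-energy objective of the kind described in the abstract typically takes a shape such as the following (our notation, an assumed form rather than the authors' exact objective):

```latex
\min_{\{x_t,\, f_t,\, p_t\}_{t \ge 1}} \;
\lim_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T}
\mathbb{E}\bigl[\, w_d \, D_t(x_t, f_t, p_t) + w_e \, E_t(x_t, f_t, p_t) \bigr]
```

where x_t ∈ {0, 1} is the offloading decision, f_t the CPU frequency, p_t the transmit power, D_t and E_t the per-step delay and energy consumption, and w_d, w_e the tradeoff weights.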

Reinforcement learning for computation offloading

This section presents a reinforcement learning-based approach for computation offloading in the MEC system. We first briefly introduce and analyze existing reinforcement learning methods; then we describe in detail the DDPG algorithm that solves the power scheduling problem [22] and the Double DQN algorithm that solves the task offloading problem [34].
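The snippet below is a generic DDPG update step in PyTorch, offered only as an assumed-standard sketch of how the power-scheduling agent could be trained; the network sizes, dimensions, and hyperparameters are our own choices, not the paper's:

```python
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 6, 2   # assumed: action = (CPU frequency, transmit power) in [0, 1]

def mlp(in_dim, out_dim, out_act=None):
    layers = [nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim)]
    if out_act is not None:
        layers.append(out_act)
    return nn.Sequential(*layers)

actor = mlp(STATE_DIM, ACTION_DIM, nn.Sigmoid())      # deterministic policy mu(s)
critic = mlp(STATE_DIM + ACTION_DIM, 1)               # action-value Q(s, a)
actor_tgt = mlp(STATE_DIM, ACTION_DIM, nn.Sigmoid())
critic_tgt = mlp(STATE_DIM + ACTION_DIM, 1)
actor_tgt.load_state_dict(actor.state_dict())
critic_tgt.load_state_dict(critic.state_dict())

actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def ddpg_update(s, a, r, s2, gamma=0.99, tau=0.005):
    """One DDPG step on a batch (s, a, r, s2) sampled from the replay buffer."""
    # Critic: regress Q(s, a) onto the bootstrapped target from the target nets.
    with torch.no_grad():
        q_next = critic_tgt(torch.cat([s2, actor_tgt(s2)], dim=1))
        target = r + gamma * q_next
    critic_loss = nn.functional.mse_loss(critic(torch.cat([s, a], dim=1)), target)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Actor: deterministic policy gradient, ascend Q(s, mu(s)).
    actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

    # Polyak-average the target networks toward the online networks.
    for net, tgt in ((actor, actor_tgt), (critic, critic_tgt)):
        for p, p_t in zip(net.parameters(), tgt.parameters()):
            p_t.data.mul_(1 - tau).add_(tau * p.data)
```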

Performance analysis

In this section, we first introduce the simulation setup and the baselines. Then the performance of our computation offloading policy is evaluated through extensive simulations. The schemes are implemented in Python 3.6, and the experiments are run on a Windows 10 computer with a 2.9 GHz Intel Core i5 CPU, an NVIDIA GeForce GTX 1660 GPU, and 16 GB of RAM.

Conclusion

In this article, we model a computation offloading framework for the resource-constrained MEC system. Then, we take the task execution delay and energy consumption, key factors in measuring the QoE of mobile users, as the optimization objectives. To minimize both the execution delay and the energy consumption, we propose a double reinforcement learning computation offloading algorithm that jointly schedules the CPU frequency, the transmission power, and the task offloading mode. As compared with other benchmark schemes, the proposed method reduces the delay by 48% and the energy consumption by 53%.

CRediT authorship contribution statement

This paper studies the problem of computation offloading in resource-constrained MEC systems and proposes a double reinforcement learning computation offloading algorithm. Experiments show that our method reduces the delay by 48% and the energy consumption by 53%. In this work, Liao's main contributions are the experiments and the writing of the paper. The experiments were conducted under the guidance of Lai. In addition, Yang and Zeng provided technical guidance and financial support.


Declaration of Competing Interest

We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and that there is no professional or other personal interest of any nature or kind in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled “Online Computation Offloading with Double Reinforcement Learning Algorithm in Mobile Edge Computing”.

Acknowledgements

This work is partially supported by the Natural Science Foundation of Guangdong (2021A1515011578), the Funding of the Longyan Institute of Industry and Education Integration, Xiamen University (20210302), the Natural Science Foundation of China (61872154), the Natural Science Foundation of Fujian (2018J01097), and the Shenzhen Basic Research Program (JCYJ20190809161603551).

Linbo Liao received the B.S. degree in Internet of Things Engineering from Nanchang Hangkong University in 2019 and the master's degree from the School of Informatics, Xiamen University, in 2022. His research interests include edge and distributed computing and machine learning.

References (39)

  • J. Gubbi et al., Internet of Things (IoT): a vision, architectural elements, and future directions, Future Gener. Comput. Syst. (2013)

  • F. Song et al., Offloading dependent tasks in multi-access edge computing: a multi-objective reinforcement learning approach, Future Gener. Comput. Syst. (2022)

  • A.A. Al-Habob et al., Collision-free sequential task offloading for mobile edge computing, IEEE Commun. Lett. (2019)

  • A.A. Al-Habob et al., Task scheduling for mobile edge computing using genetic algorithm and conflict graphs, IEEE Trans. Veh. Technol. (2020)

  • T. Alfakih et al., Task offloading and resource allocation for mobile edge computing by deep reinforcement learning based on SARSA, IEEE Access (2020)

  • P.A. Apostolopoulos et al., Cognitive data offloading in mobile edge computing for the Internet of Things, IEEE Access (2020)

  • S. Barbarossa et al., Communicating while computing: distributed mobile cloud computing over 5G heterogeneous networks, IEEE Signal Process. Mag. (2014)

  • R. Bellman, Dynamic programming, Science (1966)

  • D.S. Bernstein et al., The complexity of decentralized control of Markov decision processes, Math. Oper. Res. (2002)

  • T.D. Burd et al., Processor design for portable systems, J. VLSI Signal Process. Syst. Signal Image Video Technol. (1996)

  • N. Chen et al., When learning joins edge: real-time proportional computation offloading via deep reinforcement learning

  • X. Chen et al., Efficient multi-user computation offloading for mobile-edge cloud computing, IEEE/ACM Trans. Netw. (2015)

  • X. Chen et al., Efficient resource allocation for relay-assisted computation offloading in mobile-edge computing, IEEE Internet Things J. (2019)

  • Y. Ding et al., A code-oriented partitioning computation offloading strategy for multiple users and multiple mobile edge computing servers, IEEE Trans. Ind. Inform. (2019)

  • T.Q. Dinh et al., Offloading in mobile edge computing: task allocation and computational frequency scaling, IEEE Trans. Commun. (2017)

  • S. Hu et al., Dynamic request scheduling optimization in mobile edge computing for IoT applications, IEEE Internet Things J. (2019)

  • L. Huang et al., Deep reinforcement learning for online computation offloading in wireless powered mobile-edge computing networks, IEEE Trans. Mob. Comput. (2019)

  • L.P. Kaelbling et al., Reinforcement learning: a survey, J. Artif. Intell. Res. (1996)

  • M. Kamoun et al., Joint resource allocation and offloading strategies in cloud enabled cellular networks


Yongxuan Lai received the bachelor's degree in Management Information Systems from Renmin University of China in 2004 and the Ph.D. degree in Computer Science from Renmin University of China in 2009. He is currently a professor in the Software Engineering Department, School of Informatics, Xiamen University, China. He is also the dean of the School of Mathematics and Information Engineering, Longyan University, China. He was a visiting scholar at the University of Queensland, Australia, from Sep. 2017 to Sep. 2018. His research interests include network data management, intelligent transportation systems, and big data management and analysis.

Fan Yang received his Ph.D. degree in Control Theory and Control Engineering from Xiamen University in 2009. He is currently an associate professor in the Department of Automation at Xiamen University. His research interests include feature selection, ensemble learning, and intelligent transportation systems.

Wenhua Zeng received his Ph.D. degree in Industrial Automation from Zhejiang University in 1989. He is currently a professor in the Department of Software Engineering at Xiamen University. His research interests include embedded systems, embedded software, the Internet of Things, and cloud computing.
