Low load DIDS task scheduling based on Q-learning in edge computing environment

https://doi.org/10.1016/j.jnca.2021.103095

Abstract

Edge computing, as a new computing model, is developing rapidly but faces new challenges in network security. Because the performance of edge nodes is limited, the distributed intrusion detection system (DIDS), which relies on high-performance devices in cloud computing, must be redesigned for low load so that it can detect packets near the network edge. This paper proposes a low-load DIDS task scheduling method based on the Q-learning algorithm in reinforcement learning. The method dynamically adjusts scheduling strategies according to network changes in the edge computing environment to keep the overall load of the DIDS at a low level, while maintaining a balance between the two conflicting indicators of low load and packet loss rate. Simulation experiments show that the proposed method achieves better low-load performance than other scheduling methods, while indicators such as the malicious feature detection rate are not significantly reduced.

Introduction

In the field of computer networks, traditional cloud computing (CC) transmits data from the edge of the network to cloud servers for centralized processing. However, with the rapid development of the Internet of Things (IoT) and 5G, the various types of sensor devices at the network edge are growing explosively and generating large volumes of data. In scenarios with strict real-time requirements, a computing model based solely on cloud resources can no longer meet the real-time, security, and low-energy demands of big data processing. Edge computing (EC) therefore came into being (Shi et al., 2017).

Edge computing places compute resources closer to the sources that generate information, reducing the network latency and bandwidth usage generally associated with cloud computing (Zhang et al., 2018a). However, as data control shifts to edge nodes that are difficult to secure physically, and given the complexity of the edge computing model, the multi-source heterogeneity of data, and the limited resources of terminals, the traditional security protection mechanisms of the cloud computing environment no longer apply (He et al., 2018). As a result, the breadth and difficulty of access control and threat protection in edge computing environments have increased significantly.

In cloud computing environments, intrusion detection systems (IDS) can be used for security protection (Zhao et al., 2020). With the surge in complexity of network traffic and intrusion behaviors, traditional single-host IDS cannot meet the requirements of efficient and accurate detection (Zhao et al., 2019), so the distributed intrusion detection system (DIDS) is widely used. DIDS consists of a scheduler and multiple detection engines. The scheduler distributes traffic to multiple detection engines for detection according to certain rules, which improves detection efficiency and prevents overall paralysis due to a single point of failure.
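The scheduler-plus-engines structure described above can be sketched as follows (a minimal illustration with invented names such as `DetectionEngine` and `dispatch`; it assumes a simple least-loaded dispatch rule, not the Q-learning policy this paper develops):

```python
from collections import deque

# Minimal sketch of a DIDS scheduler dispatching packets to several
# detection engines, each with a bounded queue (illustrative names).
class DetectionEngine:
    def __init__(self, capacity):
        self.queue = deque()
        self.capacity = capacity  # max packets this engine may hold

    def load(self):
        return len(self.queue) / self.capacity

class Scheduler:
    def __init__(self, engines):
        self.engines = engines

    def dispatch(self, packet):
        # send the packet to the engine with the lowest relative load
        engine = min(self.engines, key=lambda e: e.load())
        if len(engine.queue) < engine.capacity:
            engine.queue.append(packet)
            return True
        return False  # every engine is full: the packet is lost

engines = [DetectionEngine(capacity=4) for _ in range(3)]
scheduler = Scheduler(engines)
dropped = sum(not scheduler.dispatch(p) for p in range(15))
# 15 packets into 12 total queue slots: the last 3 are dropped
```

Distributing work this way also illustrates the resilience point: losing one engine degrades capacity rather than halting detection entirely.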

However, DIDS in cloud computing relies on high-performance hardware (Liu et al., 2020), whereas in edge computing the processing power of edge nodes is limited. A single detection engine therefore cannot handle network traffic that exceeds its detection capacity, causing missed detections when the network speed is high (Zhao et al., 2016). Network traffic containing malicious content that escapes security detection poses a threat to the entire network (Hui et al., 2019). If this problem is not resolved, it will hinder the healthy development of edge computing.

Given the limited performance of a single detection engine, how the DIDS scheduler optimizes task allocation to each performance-limited detection engine becomes an important issue (Zhao et al., 2015). Sound optimization decisions by the scheduler reduce the overall packet loss rate (PLR) of the DIDS.

Although there has been some research on task scheduling in the field of edge computing (Lin et al., 2018; Diddigi et al., 2018), and an IDS framework for edge computing environments has even been proposed (Hui et al., 2019), the problems studied in this paper still require the following to be solved:

  • 1)

How can the scheduler make sound decisions and distribute detection tasks to each detection engine reasonably, so that the load of the entire system stays low and the low-load DIDS can adapt to the edge computing environment;

  • 2)

The detection engines of a DIDS running at the network edge are likely to differ in processing performance. The load capacity of each detection engine needs to be evaluated objectively so that the scheduler can make sound decisions;

  • 3)

Low load and a low packet loss rate (PLR) are two conflicting indicators. How can a balance be found between them so that the DIDS runs at low load while keeping the PLR within an acceptable range.

Because a DIDS faces fast and randomly changing network traffic, some machine learning algorithms are unsuitable for it. Reinforcement learning (RL), however, can take correct actions in response to changes in the environment to maximize the expected benefit. Some scholars have used it to build stochastic models of dynamically changing environments and achieved meaningful results (Diddigi et al., 2018). As a model-free reinforcement learning algorithm, Q-learning (QL) can find the optimal policy in a model established as a Markov decision process (MDP), so it has gradually gained attention.

On this basis, this paper models the problem as a Markov decision process and applies reinforcement learning, supported by rigorous mathematical reasoning. The key contributions of this paper are as follows.

  • 1)

A task scheduling method based on Q-learning is proposed. The method takes low load as the optimization goal and finds, through the value function, the optimal policy that keeps the DIDS in a low-load state. Following this policy, the scheduler can schedule reasonably across detection engines of different performance and packets of different lengths.

  • 2)

To avoid the problem that an excessively low load may increase the PLR, we establish an adjustment mechanism that lets the scheduler adjust the probability with which detection engines of different efficiencies are assigned tasks, based on changes in the PLR, thereby achieving a balance between the two conflicting indicators;

  • 3)

A scientific method for evaluating the processing performance of the detection engines and the load generated by packets is proposed.

The rest of this paper is organized as follows. Section 2 discusses the previous studies related to this paper. The detailed method design is introduced in Section 3. Section 4 describes how the method balances low load and packet loss. Section 5 provides the experimental results and discussions of our proposed method. Finally, Section 6 concludes this paper.

Section snippets

Related work

In this section, we will review the related research on IDS task scheduling and reinforcement learning in the edge computing environment.

Model workflow

A DIDS contains multiple detection engines with different performance levels that detect randomly arriving packets of different load levels. The workflow of the model is described below with reference to Fig. 1.

Because the load level of the next incoming packet is uncertain and the queue length is limited, for a DIDS with a fixed number of detection engines the optimal decision must minimize the overall load while keeping the PLR at an acceptable level.
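This workflow can be cast as a toy environment (the state, action, and reward choices here are illustrative assumptions, not the paper's exact model): the state is the tuple of queue lengths, the action is the engine chosen for the next packet, and the reward penalizes both total load and drops:

```python
import random

# Toy environment for the scheduling workflow (state/action/reward choices
# are assumptions for illustration, not the paper's exact model).
# state  : tuple of current queue lengths, one per detection engine
# action : index of the engine that receives the next packet
QUEUE_LIMIT = 5    # maximum queue length per engine (assumed)
DROP_PENALTY = 10  # extra penalty when a packet is lost (assumed)

def step(state, action):
    queues = list(state)
    lost = queues[action] >= QUEUE_LIMIT
    if not lost:
        queues[action] += 1
    # one unit of service completes at a randomly chosen busy engine
    busy = [i for i, q in enumerate(queues) if q > 0]
    if busy:
        queues[random.choice(busy)] -= 1
    # reward favours low total load and strongly penalizes drops
    reward = -sum(queues) - (DROP_PENALTY if lost else 0)
    return tuple(queues), reward
```

Framing the workflow this way is what lets a value-based method such as Q-learning search for the scheduling policy.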

Evaluation of performance and load

In order to

Q-learning algorithm

The Q-learning algorithm is a model-free reinforcement learning algorithm. It gives the scheduler the ability to select optimal actions according to historical experience (Tong et al., 2014). First, all states in the state space and all corresponding actions in the action space are arranged into a Q_table that stores the Q value Q(s,a); the action that yields the greatest benefit is then selected based on the Q value. In the Sarsa algorithm of reinforcement learning, the
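A minimal sketch of such a Q_table update, with assumed hyperparameter values rather than the paper's settings:

```python
import random
from collections import defaultdict

# Sketch of the Q_table update; the hyperparameter values below are
# assumptions, not the paper's settings.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
Q = defaultdict(float)  # Q_table: (state, action) -> Q(s, a)

def choose_action(state, actions):
    # epsilon-greedy: explore occasionally, otherwise act greedily on Q
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, actions):
    # off-policy update: bootstrap from the best action in next_state,
    # which is what distinguishes Q-learning from on-policy Sarsa
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```

The `best_next` term is the point of contrast with Sarsa, which instead bootstraps from the action the policy actually takes in the next state.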

Balance of contradictory indicators

Although the state-action value function above strives to minimize the overall load of the DIDS, low load and a low PLR are two conflicting indicators: an excessively low load will increase the PLR, especially when network traffic surges suddenly. These two indicators therefore need to be balanced, which first requires calculating the relevant balance parameters.
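One simple way such a balance could be expressed, purely for illustration (the weight `w` and the linear form are assumptions; the paper derives its balance parameters analytically), is a reward that penalizes a weighted sum of load and PLR:

```python
# Illustrative reward trading off load against PLR; the weight w and the
# linear form are assumptions, not the paper's derived balance parameters.
def reward(load, plr, w=0.5):
    # load and plr both lie in [0, 1]; lower is better for each,
    # so the agent is rewarded for keeping their weighted sum small
    return -(w * load + (1.0 - w) * plr)

light = reward(0.2, 0.0)     # light load, no packet loss
starved = reward(0.05, 0.6)  # very low load achieved by dropping packets
# starved < light: saving load at the cost of heavy loss is penalized
```

Shifting `w` toward 1 favours low load; shifting it toward 0 favours a low PLR, which is the kind of trade-off the adjustment mechanism must manage.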

Experiments and analysis of results

The experiments test the proposed method on a DIDS in a simulated environment. The proposed method is compared with the DDEM (Hui et al., 2019), SDMMF (Lin et al., 2018), and LB (Arian et al., 2017) algorithms from recent literature. The tests aim to answer the following questions:

  • Compared with other methods, can the overall load of DIDS be effectively reduced?

  • Will the proposed method increase the PLR?

  • How do detection engines with different performances be

Conclusion

Aiming at the problem of limited processing performance of devices in edge environments, this paper first scientifically evaluates the processing performance of each DIDS detection engine and the load generated by different packets, and then proposes a DIDS task scheduling method based on Q-learning algorithm. This method can keep DIDS in balance between the two contradictory indicators of low load and PLR. Finally, this paper compares and verifies the proposed method through a simulation

Credit author statement

Xu Zhao: Problem model, Methodology, Data curation, Writing - Original draft preparation. Guangqiu Huang, Ling Gao, Maozhen Li: Writing - Reviewing and Editing, Supervision. Quanli Gao: Validation.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

The work was supported by the Natural Science Foundation of China (NSFC) under grants (71874134), Shaanxi Science and Technology Project (2019KRM153, 2020CGXNG-012), Shaanxi Key Industry Innovation Chain Project (2020ZDLGY07-05), Xi'an Science and Technology Bureau (201805030YD8CG14(8)), Department of education of Shaanxi Province (18JK0323), Social Science Foundation of Shaanxi Province (2019M026), Shaanxi Federation of Social Sciences (2019C080).


References (36)

  • R.B. Diddigi et al.

    Novel sensor scheduling scheme for intruder tracking in energy efficient sensor networks

    IEEE Wireless Commun. Lett.

    (2018)
  • R.H. Dong et al.

    An intrusion detection model for wireless sensor network based on information gain ratio and bagging algorithm

    Int. J. Netw. Secur.

    (2018)
  • T. Ha et al.

    Suspicious flow forwarding for multiple intrusion detection systems on software-defined networks

    IEEE Network

    (2016)
  • D. He et al.

    Security in the Internet of Things supported by mobile edge computing

    IEEE Commun. Mag.

    (2018)
  • H. Hui et al.

    A new resource allocation mechanism for security of mobile edge computing system

    IEEE Access

    (2019)
  • K. Kaur et al.

    Edge computing in the industrial internet of things environment: software-defined-networks-based edge-cloud interplay

    IEEE Commun. Mag.

    (2018)
  • L. Han

    Intrusion detection model of wireless sensor networks based on game theory and an autoregressive model

    Inf. Sci.

    (2019)
  • L. Lei et al.

    Joint computation offloading and multiuser scheduling using approximate dynamic programming in NB-IoT edge computing system

    IEEE Internet Things J.

    (2019)

    Xu Zhao is a PhD student in information management and information systems at Xi'an University of Architecture & Technology, China. He is also an associate professor in the School of Computer Science, Xi'an Polytechnic University, Shaanxi, China, and an academic visitor at Brunel University London. He received the M.S. degree from Xi'an Electronic Technology University, Xi'an, Shaanxi Province, China in 2007. His research interests are Edge Computing and Cyber Security.

    Guangqiu Huang is now a professor and doctoral supervisor in the School of Management, Xi'an University of Architecture and Technology, Xi'an, China. He received the B.S. and the M.S. degree from Xi'an University of Architecture & Technology, Xi'an, China, and the Ph.D. degree from Northeast University, Shenyang, China, all in mining engineering. His research involves information management and computer intelligence. He is the consultant expert of the Government of Xi'an City, the assessment expert of National Natural Science Foundation.

    Ling Gao is currently a professor of computer science at Xi'an Polytechnic University, Xi'an, China. He received his BS degree in Computer Science from Hunan University and his MS degree in Computer Science from Northwest University, in 1985 and 1988 respectively. In 2005, he received PhD degree in Computer Science from Xi'an Jiaotong University, Xi'an, China. His research includes Network Security and Management, Embedded Internet Service. Prof. Gao is the director of China Higher Educational Information Academy, vice chairman of China Computer Federation Network and Data Communications Technical Committee, member of IEEE, CAET (China Association for Educational Technology) director.

    Maozhen Li is currently a professor in the College of Engineering, Design and Physical Sciences at Brunel University London. He received the Ph.D. degree from the Chinese Academy of Sciences in 1997 and completed postdoctoral research at Cardiff University in 2002. Prof. Li is a fellow of the British Computer Society and a member of the IET and IEEE. His research includes High Performance Computing and Knowledge and Data Engineering.

    Quanli Gao received the B.S. degree in information and computer science in 2010, and the Ph.D. degree from Northwest University, China, in 2017. He is currently an associate professor with Xi'an Polytechnic University. He has participated in several national research projects. His research interests include recommender systems and machine learning.
