Elsevier

Computer Networks

Volume 169, 14 March 2020, 107049

HELAD: A novel network anomaly detection model based on heterogeneous ensemble learning

https://doi.org/10.1016/j.comnet.2019.107049

Abstract

Network traffic anomaly detection is an important technique for ensuring network security. However, existing machine learning based anomaly detection algorithms usually have three problems. First, most models are built on stale data sets, which makes them less adaptable to real-world environments. Second, most anomaly detection algorithms cannot relearn their models as the attack environment changes. Third, from the perspective of data multi-dimensionality, a single detection algorithm has a performance ceiling and cannot adapt well to the needs of a complex network attack environment. We therefore propose a new anomaly detection framework based on the organic integration of multiple deep learning techniques. First, we use the Damped Incremental Statistics algorithm to extract features from network traffic. Second, we train an Autoencoder with a small amount of labeled data. Third, we use the Autoencoder to assign an abnormal score to network traffic. Fourth, the data labeled with the abnormal score is used to train the LSTM. Finally, a weighted method is used to obtain the final abnormal score. The experimental results show that our HELAD algorithm has better adaptability and accuracy than other state-of-the-art algorithms.

Introduction

Intrusion detection systems (IDS) are critical because networks can be vulnerable to attacks from internal and external intruders [1], [2]. Network traffic anomalies lead to a decline in network communication performance and to network service interruption. A network anomaly is defined as current network traffic that deviates seriously from normal traffic. Network anomalies are mainly caused by malicious network attacks, e.g., Denial of Service (DoS), Distributed Denial of Service (DDoS), port scans, and worm propagation, as well as by network configuration errors and other exceptions [3] caused by line interruption. As detection systems put in place to monitor computer networks, IDS have been in use since the 1980s [4]. By analysing patterns in data captured from a network, an IDS helps to detect threats [5]. Traffic anomaly detection has long been a research direction for network security researchers in academia and industry, and many related detection methods and systems have been developed.

Constantly changing attack modes make the traffic anomaly detection problem harder to solve. Traditional intrusion detection tools, such as the rule-based Snort [6], can no longer meet the growing demand for network security. A smarter intrusion detection tool is needed, one that can learn dynamically and adapt to its environment in order to defend against unknown attacks.

Machine learning is currently the mainstream approach to traffic anomaly detection. We analyze our design choices along three categories of machine learning.

(1) Shallow machine learning versus deep learning, distinguished by the number of neural network layers involved: shallow machine learning [7] has the advantage of short training time, whereas deep learning [8] has stronger representation ability. The trend toward GPU acceleration [9], [10] allows us to choose a deep learning approach.

(2) Single classifiers versus ensemble learning classifiers: the idea of ensemble learning is to improve machine learning performance by combining multiple models, which is generally expected to outperform a single model. Ensemble learning is a research hotspot of machine learning in the field of traffic anomaly detection [11], [12]. Because an ensemble learning model is superior to a single model in both predictive power and generalization ability, we introduce ensemble learning into the design of our model.

(3) Supervised versus unsupervised learning: supervised learning [13], [14] interacts with the external environment through labels, so the trained model can better incorporate human domain knowledge. It is therefore necessary to incorporate a supervised learning model.

Based on the above design choices, we have the following challenges:

(1) Ensemble learning is widely used in the field of anomaly detection. However, deep learning methods have not been used as component learners [61] of heterogeneous ensemble learning in network intrusion detection.

(2) Algorithms based on supervised learning require a large amount of labeled training data to obtain good detection results. However, real network traffic lacks large, reliably labeled data sets, which makes supervised deep learning difficult to apply.

(3) The attack environment of the network changes constantly. If the model cannot relearn, the performance of the detector will degrade.

(4) Many deep learning based anomaly detection models have not been evaluated with real traffic data. Typically, the training data comes from an idealized data set: for example, one that has been in use for too long or that was simply generated by an attack tool.

Inspired by the above observations, this paper attempts to absorb the advantages of heterogeneous ensemble learning and deep learning techniques and proposes a more effective method.

To summarize, our main contributions in this paper are listed as follows:

• We have integrated various deep learning techniques and proposed the Heterogeneous Ensemble Learning Anomaly Detection (HELAD) algorithm framework. This framework is composed of four parts: feature dimension reduction, abnormal score generation, abnormal score prediction, and anomaly detection result combination. Each module can choose the appropriate technology according to its own design.

• We apply ensemble learning to anomaly detection. Specifically, the unsupervised Autoencoder and the supervised Long Short-Term Memory (LSTM) are combined in a heterogeneous way. The Autoencoder, as one of the base learners, learns the profile of normal network traffic and provides its RMSE as the label needed to train the LSTM. The LSTM detects continuous attacks well because it records historical information and predicts whether an attack will occur next. To make the supervised LSTM usable, we introduce the concept of a temporary label (TL), which is generated by the unsupervised Autoencoder.

• We introduce the concept of retraining time slices to retrain the model. This time slice is the time required to train the anomaly detection model in the previous round. We design dynamic thresholds and integrate learning parameters in the model so that the anomaly detection effect does not degrade.

• To evaluate the HELAD model, we conduct experiments on the latest data sets that reflect the real environment. We further evaluate our algorithm by comparing it with different state-of-the-art algorithms. The experimental results consistently demonstrate the superiority and competitiveness of our proposed model.
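
The second contribution can be illustrated with a minimal sketch. It assumes the Autoencoder's reconstruction of a feature vector is already available, and it shows the RMSE that serves as the temporary label, the concatenation of that score with the original features as the LSTM input (the vector xL described in the Discussion), and a weighted combination of the two scores. The function names `rmse`, `lstm_input`, and `combine_scores`, the fixed `weight`, and the convex-combination form are illustrative assumptions; the paper's actual discriminant formula and its parameters are not reproduced in this section.

```python
import math


def rmse(x, x_hat):
    """Root-mean-square error between a feature vector and its reconstruction."""
    if len(x) != len(x_hat):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x))


def lstm_input(features, ae_score):
    """Append the Autoencoder score to the original features (vector xL)."""
    return list(features) + [ae_score]


def combine_scores(ae_score, lstm_score, weight):
    """Weighted combination of the two scores.

    Assumption: a convex combination with weight in [0, 1]; the paper's
    own discriminant formula may differ.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * ae_score + (1.0 - weight) * lstm_score


# Hypothetical feature vector and a stand-in for the Autoencoder output.
features = [1.0, 2.0, 3.0]
reconstruction = [1.0, 2.0, 5.0]

temporary_label = rmse(features, reconstruction)  # score used to train the LSTM
x_l = lstm_input(features, temporary_label)       # LSTM input vector
final_score = combine_scores(temporary_label, lstm_score=3.0, weight=0.25)
```

In this sketch a larger `weight` trusts the Autoencoder's score more, while a smaller one favors the LSTM's prediction; the value 3.0 passed as `lstm_score` is a placeholder for a trained model's output.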

The rest of the paper is organized as follows. Section 2 presents the related work. We present our system model and problem formulation in Section 3. The model training and strategy optimization are presented in Section 4. We conduct experiments and evaluate performance using real traffic trace data in Section 5. We discuss some of the details of our models and experimental methods in Section 6. Finally, we conclude our work in Section 7.

Section snippets

Related work

In this section, we review the related literature. Network intrusion detection is a classic network security issue. We analyze the related work from four aspects: methods based on traditional statistics, machine learning, deep learning, and ensemble learning. We then discuss the literature on relearning.

We first discuss the traditional statistics method. There are several examples of statistical methods that are widely used in attack detection [15]. Lee et al. [16] proposed to use

HELAD: Heterogeneous Ensemble Learning Anomaly Detection model

In this section, we first present the motivation of this paper. Next, we provide a basic model description that includes the meaning of the statistical features, the meaning of the formulas, and the way in which features are expressed. After that, we build the submodel for each part of the HELAD model. Finally, the abnormality detection result is obtained with the discriminant formula. Fig. 1 details the training process of the HELAD model and the abnormality determination process.
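
The abstract names the Damped Incremental Statistics algorithm as the feature extractor. As a rough illustration of what such a statistical feature can look like, the sketch below follows the formulation used by Kitsune (Mirsky et al., listed in the references): each stream keeps a decayed weight w, linear sum LS, and squared sum SS, all multiplied by 2^(-λΔt) before a new value is added, so that the mean LS/w and the variance |SS/w - (LS/w)^2| can be read without storing past packets. The class name `DampedIncStat`, the decay rate, and the sample packet sizes are hypothetical, and HELAD's exact feature set may differ.

```python
import math


class DampedIncStat:
    """Damped incremental statistics for one stream (illustrative helper)."""

    def __init__(self, decay_rate, t0=0.0):
        self.decay_rate = decay_rate  # lambda: larger values forget faster
        self.w = 0.0                  # decayed count of observations
        self.ls = 0.0                 # decayed linear sum
        self.ss = 0.0                 # decayed squared sum
        self.t_last = t0              # timestamp of the last update

    def update(self, x, t):
        # Decay the old sums by 2^(-lambda * elapsed time), then add x.
        gamma = 2.0 ** (-self.decay_rate * (t - self.t_last))
        self.w = self.w * gamma + 1.0
        self.ls = self.ls * gamma + x
        self.ss = self.ss * gamma + x * x
        self.t_last = t

    def mean(self):
        return self.ls / self.w if self.w > 0 else 0.0

    def variance(self):
        if self.w <= 0:
            return 0.0
        m = self.mean()
        # abs() guards against tiny negative values from rounding.
        return abs(self.ss / self.w - m * m)

    def std(self):
        return math.sqrt(self.variance())


# Two hypothetical packets: (timestamp in seconds, size in bytes).
stat = DampedIncStat(decay_rate=1.0)
for t, size in [(0.0, 100.0), (1.0, 200.0)]:
    stat.update(size, t)
```

Because only three running sums are stored per stream, the cost per packet is constant, which is what makes this style of feature extraction suitable for online traffic.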

Model training & strategy optimization

The previous section introduced our model mainly from the perspective of technical construction. This section elaborates on the specific training details and on optimizing the effect of the entire model. The goal of training is to obtain the values of the parameters gp and p in the discriminant function once the neural network is stable. The values of gp and p are detailed in Table 2. All specific training steps are shown in Fig. 1. We divide the overall

Experiments & evaluation

This section covers our experimental results. Our code is available at the open-source code repository.2 To evaluate our model systematically, we examine the following four points: (1) Under what circumstances does our algorithm achieve the best performance? (2) Are ensemble learning and the relearning function effective? (3) How does our algorithm compare with other state-of-the-art algorithms on different data sets? (4)

Discussion

In this section, we will discuss some of the details of our models and experimental methods.

Q1: Is ey0 needed in the final discriminant formula?

In the HELAD model, the abnormal score calculated by the Autoencoder and the original features are concatenated into a new vector xL, which is used as the input to the LSTM. LSTM training uses this abnormal score, and the LSTM's forecasts are affected by it. However, our discriminant formula also considers the abnormal score

Conclusion

The network environment is increasingly complex, and the forms of attack are ever-changing. Many previously published machine learning based anomaly detection models are evaluated on data sets such as KDD, which makes them impractical in real-life environments. By organically integrating various deep learning techniques, the HELAD model can better combine the LSTM classifier and the Autoencoder classifier. This provides

Declaration of Competing interest

The authors declare that they have no conflicts of interest regarding this work.

Acknowledgments

This work is supported by the National Key Research and Development Program of China under grant no. 2018YFB1800205. We also thank the National Engineering Lab for Next Generation Internet Technologies for its support (no. NGIT2019004).

Ying Zhong received a master's degree from the College of Information Science and Engineering, Hunan University, China. He is working towards the Ph.D. degree at the Institute for Network Sciences and Cyberspace at Tsinghua University. His research interests include machine learning, big data, network security, and data analysis and mining algorithms and their parallel implementation.

References (63)

  • D.E. Denning

    An intrusion-detection model

    IEEE Trans. Softw. Eng.

    (1987)
  • R. Sommer et al.

    Outside the closed world: on using machine learning for network intrusion detection

    IEEE Symposium on Security and Privacy. IEEE Computer Society

    (2010)
  • A. Abraham et al.

    Evolutionary design of intrusion detection programs

    Int. J. Netw. Secur.

    (2007)
  • P. Mishra et al.

    A detailed investigation and analysis of using machine learning techniques for intrusion detection

    IEEE Communications Surveys & Tutorials

    (2019)
  • J. Carr, Snort: Open source network intrusion prevention,...
  • K.E. Smith et al.

    Shepard interpolation neural networks with k-means: a shallow learning method for time series classification

    2018 International Joint Conference on Neural Networks (IJCNN)

    (2018)
  • L. Shao et al.

    Learning deep and wide: a spectral method for learning deep networks

    IEEE Trans. Neural Netw. Learn. Syst.

    (2014)
  • J. Jin et al.

    GPU-accelerated parallel algorithms for linear rankSVM

    J. Supercomput.

    (2015)
  • E. Kijsipongse et al.

    A hybrid GPU cluster and volunteer computing platform for scalable deep learning

    J. Supercomput.

    (2018)
  • A.A. Aburomman et al.

    A survey of intrusion detection systems based on ensemble and hybrid classifiers

    Comput. Secur.

    (2017)
  • W. Lee et al.

    Information-theoretic measures for anomaly detection

    IEEE Symp. Secur. Privacy

    (2001)
  • M. Yu

    A nonparametric adaptive CUSUM method and its application in network anomaly detection

    Int. J. Adv. Comput. Technol.

    (2012)
  • B. Krishnamurthy et al.

    Sketch-based change detection: methods, evaluation, and applications

    Proceedings of the 3rd ACM SIGCOMM Conference on Internet Measurement (IMC ’03)

    (2003)
  • J.D. Brutlag

    Aberrant behavior detection in time series for network service monitoring

    Proceedings of the USENIX Conference on System Administration

    (2000)
  • A. Patcha et al.

    An overview of anomaly detection techniques: existing solutions and latest technological trends

    Comput. Netw.

    (2007)
  • T.T. Nguyen et al.

    A survey of techniques for internet traffic classification using machine learning

    IEEE Commun. Surv. Tutor.

    (2008)
  • A.L. Buczak et al.

    A survey of data mining and machine learning methods for cyber security intrusion detection

    IEEE Commun. Surv. Tutor.

    (2016)
  • I. Goodfellow et al.

    Deep learning

    MIT Press

    (2016)
  • N. Shone et al.

    A deep learning approach to network intrusion detection

    IEEE Trans. Emerging Top. Comput. Intell.

    (2018)
  • F.A. Khan et al.

    TSDL: a two-stage deep learning model for efficient network intrusion detection

    IEEE Access

    (2019)
  • Y. Mirsky et al.

    Kitsune: an ensemble of autoencoders for online network intrusion detection

    Netw. Distrib. Syst. Secur. Symp.

    (2018)

    Wenqi Chen is now working toward an undergraduate degree in computer science and technology at Tsinghua University. His research interests include cyberspace security and network intrusion detection.

    Zhiliang Wang received the B.E., M.E. and Ph.D. degrees in computer science from Tsinghua University, China in 2001, 2003 and 2006, respectively. He is currently an Associate Professor in the Institute for Network Sciences and Cyberspace at Tsinghua University. His research interests include formal methods and protocol testing, the next generation Internet, and network measurement.

    Yifan Chen is now a junior at Beijing University of Posts and Telecommunications, Beijing, P.R. China. His research interests include computer networks and intrusion detection systems.

    Kai Wang is now working toward an undergraduate degree at the University of Electronic Science and Technology of China. His research interests include cyberspace security and network intrusion detection.

    Yahui Li received her B.S. degree from the College of Software from Jilin University, China in 2015. She is now pursuing her Ph.D. at Tsinghua University. Her research concerns network verification, network testing and formal methods.

    Xia Yin received her B.E., M.E. and Ph.D. degrees in computer science from Tsinghua University in 1995, 1997 and 2000, respectively. She is a Full Professor at the Department of Computer Science and Technology at Tsinghua University. Her research interests include future Internet architectures, formal methods, protocol testing and large-scale Internet routing.

    Xingang Shi received his B.S. degree from Tsinghua University and his Ph.D. degree from The Chinese University of Hong Kong. He is now working at the Institute for Network Sciences and Cyberspace at Tsinghua University. His research interests include network measurement and routing protocols.

    Jiahai Yang received his M.S. and Ph.D. degrees in computer science from Tsinghua University, Beijing, P.R. China, in 1992 and 2003, respectively. He is now a professor at Tsinghua University. His research interests include Internet architecture and its protocols, IP routing technology, network measurement, network management, cloud computing, and big data.

    Keqin Li is a SUNY Distinguished Professor of computer science with the State University of New York. He is also a Distinguished Professor at Hunan University, China. His current research interests include cloud computing, fog computing and mobile edge computing, energy-efficient computing and communication, embedded systems and cyberphysical systems, heterogeneous computing systems, big data computing, high-performance computing, CPU-GPU hybrid and cooperative computing, computer architectures and systems, computer networking, machine learning, intelligent and soft computing. He has published over 690 journal articles, book chapters, and refereed conference papers, and has received several best paper awards. He currently serves or has served on the editorial boards of the IEEE Transactions on Parallel and Distributed Systems, the IEEE Transactions on Computers, the IEEE Transactions on Cloud Computing, the IEEE Transactions on Services Computing, and the IEEE Transactions on Sustainable Computing. He is an IEEE Fellow.
