An identification strategy for unknown attack through the joint learning of space–time features

https://doi.org/10.1016/j.future.2020.11.023

Highlights

  • An identification strategy for unknown attack behaviors was presented.

Abstract

Deep learning (DL) can effectively extract the features of attack behaviours and identify unknown attack behaviours. However, current DL-based methods learn spatial and temporal features separately and fail to consider the spatiotemporal correlation of cyber events. To close this gap, this paper proposes an identification strategy for unknown attack behaviours through the joint learning of spatiotemporal features. First, a double-layer long short-term memory (LSTM) network was adopted to learn the spatial features of data packets and the temporal features of the network flow, which makes attack behaviour recognition less dependent on prior knowledge. Next, temporal attention was constructed to suppress noise in the spatial features of the data packets; spatial attention was designed to down-weight temporal features with low information density; and the two were fused to establish the spatiotemporal dependence of cyber-attack behaviours and distinguish the importance of spatiotemporal features. Finally, our identification strategy was experimentally compared with identification models based solely on spatial features or temporal features. The comparison shows that our strategy outperformed the comparison models by 2% in recognition accuracy. Thus, fusing spatial and temporal features can effectively improve the identification accuracy of unknown attack behaviours.

Introduction

The identification of network attack elements is an important part of network security. After network data streams are collected at different locations in the network system, the types of network security elements to which they belong must be accurately identified to provide a basis for the subsequent network security situation assessment. The identification of network security elements is mainly based on the attributes and characteristics of the collected data, which are classified into normal, DoS, Probe, R2L, U2R and other categories according to the characteristics of the attack [1], [2]. In the classification process, the most important step is feature recognition, which is also the focus of current research. Machine learning is now widely used in the field of attack recognition. Researchers use algorithms such as Naive Bayes [3], Support Vector Machines [4], Random Forest [5], DBSCAN [6], and X-means [7] to build models based on various characteristics to perform network traffic analysis and classification. Although these recognition models achieve good detection performance and high recognition rates in laboratory environments, all of them rely too heavily on manually selected features. Features are commonly set by researchers through experience before the model is established, typically from the perspective of network flow attributes, time, behaviour, etc. Reasonable features can effectively improve the performance of the model, but manual selection places high demands on the designer's prior knowledge, and fixed features provide an opportunity for attackers. Attackers can use the idea of adversarial machine learning to change the related characteristics of the attack behaviour in a targeted manner to evade detection by the model. Cui et al. [8] note that attackers can eliminate spatial similarity by injecting specific data packets and data flow noise into the network traffic, and eliminate temporal similarity by adding random time delays to the communication. Therefore, how to establish a deep learning model for network attack recognition that does not depend on hand-crafted features remains a subject worthy of study.

In recent years, deep learning has been widely applied to identify network attack behaviours and has achieved some success. Wen et al. [9] proposed the use of a multi-layer feed-forward neural network with a back-propagation mechanism to build a classifier and improved the algorithm so that it could dynamically adjust the learning rate of the model when updating the weights. Wu et al. [10] combined CNN and LSTM to construct a botnet detection system based on deep learning, which extracts network traffic features from the two dimensions of time and space. Wang et al. [11] used convolutional neural networks to learn the spatial features of network traffic and image classification technology to identify malicious traffic, achieving high accuracy. Wang et al. [12] proposed using CNN to learn the characteristics of network traffic and using graph classification to achieve traffic classification. Torres et al. [13] converted the traffic features into characters and used a recurrent neural network to learn the time-series features of the character strings for traffic anomaly detection, achieving good results. Yang et al. [1] proposed a classification model and a training method based on the radial basis neural network, in which the training sample error is used to construct a cost function that is then minimized to improve the classification accuracy. Alsirhani et al. [3] designed a classifier for IRC botnets using J48, Naive Bayes, and Bayesian network algorithms, which has a low missed-detection (false negative) rate. Al-Jarrah et al. [14] proposed a new randomized data segmentation learning model, which used an improved forward selection and sorting technique to filter redundant and irrelevant features from the feature set and reduced the bulky training dataset with a data pruning method based on Thiessen polygons (Voronoi diagrams). Ma et al. [15] used deep neural networks to perform network traffic anomaly detection experiments on the KDD99 dataset. Saad et al. [16] compared the performance of five classifiers, including SVM and KNN, in real-time detection of P2P botnets. Niyaz et al. [17] used the deep belief network to study anomaly detection on the NSL-KDD dataset. Koning et al. [18] proposed a large-scale high-speed detection system for NetFlow data; by adopting a random forest model to dynamically select features, the false negative rate and false positive rate can be adaptively balanced in different application scenarios.

Unlike the aforementioned existing work, this paper combines various deep learning algorithms to build a model that abstracts features step by step through multi-layer neural networks and automatically learns network traffic characteristics from two dimensions (time and space) to identify large-scale complex attack behaviours. The model recognizes the data features in network data messages by constructing a spatial feature learning module, adopts a temporal feature learning module to learn the time-series features of network attack behaviour, reduces the dependence of attack recognition on prior knowledge of hand-crafted features, protocols, and topology, uses the spatiotemporal attention fusion mechanism to construct the spatiotemporal dependence information in network attack behaviours, and uses a segmentation strategy to construct an attack behaviour recognition framework. The results are applicable to static networks such as traditional Ethernet and dynamic networks such as IIoT.

The contributions of this article are as follows.

(1) An identification strategy through joint learning of spatiotemporal features is presented for unknown attack behaviours that do not have obvious characteristics, are difficult to find, and have high concealment.

(2) The double-layer LSTM (DLLSTM) algorithm can reduce the dependence on prior knowledge of hand-crafted features, protocols, and topology.

(3) The spatial attention was fused with the temporal attention to establish the spatiotemporal dependence of cyber-attack behaviours and distinguish the importance of spatiotemporal features.

The remainder of this article is organized as follows. Section 2 presents the design of the double-layer LSTM spatio-temporal learning model. The fusion and calculation of space–time attention are presented in Section 3. Section 4 explains the identification and classification of network attacks. Section 5 contains the experimental results and a discussion. Section 6 concludes this article.

Section snippets

Data pre-processing

The characteristics of network behaviour are contained in the data packets of the network flow. Therefore, the network data flow must be processed so that it can serve as the basic carrier for network behaviour identification. However, the existing experimental datasets are stored in the pcap file format, so the original pcap data must be converted into flows before the model is established. To better understand the composition of the data flow, the
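The conversion of captured packets into flows can be illustrated with a minimal sketch. It assumes the packets have already been parsed out of the pcap file into dictionaries; the field names (`ts`, `src_ip`, `src_port`, `dst_ip`, `dst_port`, `proto`) and the choice of a direction-independent 5-tuple as the flow key are illustrative assumptions, not the procedure specified in the paper.

```python
from collections import defaultdict


def flow_key(pkt):
    """Build a direction-independent 5-tuple key (assumed flow definition)."""
    a = (pkt["src_ip"], pkt["src_port"])
    b = (pkt["dst_ip"], pkt["dst_port"])
    lo, hi = sorted([a, b])  # both directions map to the same key
    return (lo, hi, pkt["proto"])


def packets_to_flows(packets):
    """Group parsed packets into flows, each ordered by timestamp."""
    flows = defaultdict(list)
    for pkt in sorted(packets, key=lambda p: p["ts"]):
        flows[flow_key(pkt)].append(pkt)
    return dict(flows)


# Hypothetical parsed packets: two directions of one TCP flow plus one UDP packet.
pkts = [
    {"ts": 2.0, "src_ip": "10.0.0.2", "src_port": 80,
     "dst_ip": "10.0.0.1", "dst_port": 5000, "proto": "TCP"},
    {"ts": 1.0, "src_ip": "10.0.0.1", "src_port": 5000,
     "dst_ip": "10.0.0.2", "dst_port": 80, "proto": "TCP"},
    {"ts": 1.5, "src_ip": "10.0.0.3", "src_port": 53,
     "dst_ip": "10.0.0.1", "dst_port": 6000, "proto": "UDP"},
]
flows = packets_to_flows(pkts)
```

Each flow then holds a time-ordered packet sequence, which is the kind of input a packet-level (spatial) and flow-level (temporal) learner would consume.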

Fusion of space–time attention

The traditional LSTM model does not consider the spatiotemporal dependence, so the extracted features do not include the spatiotemporal significance. To capture the spatio-temporal dependence information in the network attack behaviour, the model in this paper introduces the spatio-temporal attention module in the network architecture, as shown in Fig. 4, where a two-level fusion STA module that considers the attentional synchronization characteristics of different features is provided. The
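As a rough illustration of how spatial and temporal attention can be combined, the sketch below weights spatial positions within each time step, then weights the time steps themselves. The details of the paper's two-level STA module are not available in this excerpt, so the linear scoring vectors `w_s` and `w_t`, the tensor layout `(T, P, D)`, and the order of the two stages are all assumptions made for illustration.

```python
import numpy as np


def softmax(z, axis):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def fuse_attention(X, w_s, w_t):
    """Fuse spatial and temporal attention over features X of shape (T, P, D).

    T: time steps, P: spatial positions per step, D: feature size.
    w_s, w_t: hypothetical scoring vectors of shape (D,).
    """
    alpha = softmax(X @ w_s, axis=1)                # (T, P) spatial weights
    per_step = (alpha[..., None] * X).sum(axis=1)   # (T, D) attended step features
    beta = softmax(per_step @ w_t, axis=0)          # (T,) temporal weights
    fused = (beta[:, None] * per_step).sum(axis=0)  # (D,) fused feature
    return fused, alpha, beta


rng = np.random.default_rng(0)
X = rng.normal(size=(4, 9, 3))
fused, alpha, beta = fuse_attention(X, rng.normal(size=3), rng.normal(size=3))
```

With zero scoring vectors, both attention distributions become uniform and the fused feature reduces to the plain mean over time and space, which is a convenient sanity check.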

Classified learning training

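The input of the STA-LSTM model in this paper is the extracted spatio-temporal attention features. During the initialization phase of the STA-LSTM model, two three-layer perceptrons are used to calculate the memory state $x_0$ and the hidden state $h_0$, which are noted as:

$$x_0 = \phi_{\mathrm{init},x}\left(\frac{1}{T}\sum_{t=1}^{T}\left(\frac{1}{K^2}\sum_{i=1}^{K^2} X_1(t,i)\right)\right)$$

$$h_0 = \phi_{\mathrm{init},h}\left(\frac{1}{T}\sum_{t=1}^{T}\left(\frac{1}{K^2}\sum_{i=1}^{K^2} X_1(t,i)\right)\right)$$

where $\phi_{\mathrm{init},x}$ and $\phi_{\mathrm{init},h}$ are each implemented by a three-layer perceptron, and $T$ is the length of the data group sequence, which are shown as follows: $f_n = \sigma W_{fx} x_n$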
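The initialization step, which averages the features over the T time steps and the K² spatial positions and passes the result through two separate perceptrons, can be sketched as follows. The tensor layout `(T, K*K, D)`, the tanh activations, the layer widths, and the helper names `mlp3` and `init_states` are assumptions for illustration; the excerpt only states that each mapping is a three-layer perceptron.

```python
import numpy as np


def mlp3(x, weights, biases):
    """Three-layer perceptron; tanh on the hidden layers is an assumption."""
    h = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        h = h @ W + b
        if i < len(weights) - 1:
            h = np.tanh(h)
    return h


def init_states(X1, params_x, params_h):
    """Compute (x0, h0) from features X1 of shape (T, K*K, D)."""
    mean_feat = X1.mean(axis=1).mean(axis=0)  # average over positions, then time
    x0 = mlp3(mean_feat, *params_x)
    h0 = mlp3(mean_feat, *params_h)
    return x0, h0


def random_mlp(sizes, rng):
    """Hypothetical random parameters for layer widths given in `sizes`."""
    ws = [rng.normal(0, 0.1, (a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
    bs = [np.zeros(b) for b in sizes[1:]]
    return ws, bs


rng = np.random.default_rng(0)
T, K, D, H = 5, 3, 8, 16
X1 = rng.normal(size=(T, K * K, D))
x0, h0 = init_states(X1, random_mlp([D, 32, 32, H], rng),
                     random_mlp([D, 32, 32, H], rng))
```

Both states are computed from the same averaged feature vector and differ only through the parameters of their respective perceptrons.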

Experimental environment

(1) Experimental environment

In the simulation experiment of this article, the deep learning framework is Keras, and the back-end framework is TensorFlow; the running hardware and operating system environment are shown in Table 2.

In terms of neural network architecture, two LSTM layers and two fully connected layers are used in total, followed by a softmax layer that serves as the classifier. The structure of each layer is described in Section 2.3.
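The layer ordering described above (two LSTM layers, two fully connected layers, then softmax) can be traced with a framework-free forward pass. This is only a sketch of the data flow written in NumPy rather than the Keras model used in the experiments; the hidden sizes, the ReLU on the first fully connected layer, the gate ordering, and the five output classes (normal, DoS, Probe, R2L, U2R) are assumptions.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_layer(seq, W, U, b):
    """Run one LSTM layer over seq (T, D_in); return hidden states (T, H)."""
    H = U.shape[0]
    h, c = np.zeros(H), np.zeros(H)
    out = []
    for x in seq:
        z = x @ W + h @ U + b        # (4H,) stacked gate pre-activations
        i, f, o, g = np.split(z, 4)  # input, forget, output, candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        out.append(h)
    return np.stack(out)


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def forward(seq, params):
    """Two stacked LSTM layers, two dense layers, softmax output."""
    h1 = lstm_layer(seq, *params["lstm1"])
    h2 = lstm_layer(h1, *params["lstm2"])
    W1, b1 = params["fc1"]
    W2, b2 = params["fc2"]
    hidden = np.maximum(0.0, h2[-1] @ W1 + b1)  # ReLU is an assumption
    return softmax(hidden @ W2 + b2)


def random_params(d_in, hidden, n_classes, seed=0):
    """Hypothetical random parameters, for shape checking only."""
    rng = np.random.default_rng(seed)

    def lstm(d, h):
        return (rng.normal(0, 0.1, (d, 4 * h)),
                rng.normal(0, 0.1, (h, 4 * h)), np.zeros(4 * h))

    return {"lstm1": lstm(d_in, hidden), "lstm2": lstm(hidden, hidden),
            "fc1": (rng.normal(0, 0.1, (hidden, hidden)), np.zeros(hidden)),
            "fc2": (rng.normal(0, 0.1, (hidden, n_classes)), np.zeros(n_classes))}


params = random_params(d_in=8, hidden=16, n_classes=5)
probs = forward(np.ones((10, 8)), params)
```

The output is a probability vector over the assumed classes; training, regularization, and the attention features fed to the first layer are outside the scope of this sketch.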

(2) Data set

This paper selects KDD99 as the dataset for

Conclusion

In the process of attack behaviour recognition in IIoT, there are problems such as the difficulty of manual feature extraction and the misjudgement of behaviour categories. Considering these problems, this paper proposes a double-layer LSTM network attack recognition model that integrates spatio-temporal attention and performs automatic network attack feature recognition from the two dimensions of time and space. The model recognizes the data features in network data messages by constructing a

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the Guangxi Natural Science Foundation, China under Grant 2020GXNSFBA159042, in part by the National Natural Science Foundation of China under Grant 62061003, in part by the Guangxi Education Department Program, China under Grant 2020KY08020, and in part by the Doctoral Fund of Guangxi University of Science and Technology, China under Grant XiaoKe Bo19Z33.

References (19)

  • Wang Y. et al. Network traffic classification method basing on CNN. J. Commun. (2018)
  • Koning R. et al. CoreFlow: Enriching Bro security events using network traffic monitoring data. Future Gener. Comput. Syst. (2018)
  • Yang Z.H. et al. Identification of Malicious injection attacks in Dense rating and Co-Visitation Behaviors. IEEE Trans. Inf. Forensics Secur. (2021)
  • Tang M.J. et al. Big data for cybersecurity: Vulnerability disclosure trends and dependencies. IEEE Trans. Big Data (2019)
  • Alsirhani A. et al. Ddos detection system: Using a set of Classification Algorithms Controlled by Fuzzy Logic system in apache Spark. IEEE Trans. Netw. Serv. Manag. (2019)
  • Mumtaz S. et al. Guest editorial 5G and beyond mobile technologies and applications for industrial IoT (IIoT). IEEE Trans. Ind. Inform. (2018)
  • Ahmed S. et al. Unsupervised machine learning-based detection of covert data integrity assault in smart grid networks utilizing isolation forest. IEEE Trans. Inf. Forensics Secur. (2019)
  • Song M.K. et al. Analyzing user-level privacy attack against federated learning. IEEE J. Sel. Areas Commun. (2020)
  • Vinayakumar R. et al. A visualized botnet detection system based deep learning for the internet of things networks of smart cities. IEEE Trans. Ind. Appl. (2020)
There are more references available in the full text version of this article.

Huan Wang was born in Chaoyang, Liaoning, China in 1987. He received the B.S., M.S., and Ph.D. degrees in Computer Science and Technology from Changchun University of Science and Technology, Changchun, China in 2009, 2012 and 2017, respectively.

From 2012 to 2019, he was a lecturer in the College of Computer Science and Technology, Changchun University of Science and Technology. Since 2019, he has been an associate researcher with the School of Computer Science and Communications Engineering, Guangxi University of Science and Technology. He is the author of 12 articles. His research interests include network and information security, and high reliability software.

Dr. Wang is a member of Chinese computer society.

Shahid Mumtaz is an ACM Distinguished Speaker, IEEE Senior member, founder and EiC of IET "Journal of Quantum Communication," Vice-Chair: Europe/Africa Region, IEEE ComSoc: Green Communications & Computing society, and Vice-Chair for IEEE standard on P1932.1: Standard for Licenced/Unlicensed Spectrum Interoperability in Wireless Mobile Networks. He is the founder of two journals. He has more than 12 years of wireless industry/academic experience. He received his Master's and Ph.D. degrees in Electrical & Electronic Engineering from Blekinge Institute of Technology, Sweden, and University of Aveiro, Portugal in 2006 and 2011, respectively. He is the author of 4 technical books, 12 book chapters, and 180+ technical papers (150+ journal/transaction, 80+ conference, 2 IEEE best paper awards) in the area of mobile communications. Most of his publications are in the field of cybersecurity and network security. He is serving as Scientific Expert and Evaluator for various Research Funding Agencies. He was awarded an "Alain Bensoussan fellowship" in 2012. He is the recipient of the NSFC Researcher Fund for Young Scientist in 2017 from China.

Houjun Li was born in Qinzhou, Guangxi, China in 1985. He received the M.S. and Ph.D. degrees from the College of Computer Science and Technology, Beijing University of Technology, in 2011 and 2015, respectively.

He has been a lecturer in the College of Computer Science and Communication Engineering, Guangxi University of Science and Technology since 2015. His current research interests include machine intelligence and pattern analysis.

Dr. Li is a member of Chinese computer society.

Jingxian Liu was born in Wuzhou, Guangxi, China in 1984. He received the B.S. and M.S. degrees from China University of Geosciences (Beijing), in 2007 and 2010, respectively, and the Ph.D. degree in communication and information system from Beihang University.

He has been a lecturer in the College of Computer Science and Communication Engineering, Guangxi University of Science and Technology since 2019. His research interests include multi-sensor information fusion, information security and radar countermeasure.

Dr. Liu is a member of Chinese computer society.

Yang Fan was born in Lanzhou, Gansu, China in 1982. He received the B.S. and M.S. degrees in computer science and technology from the Lanzhou University of Technology, Lanzhou, China, in 2007 and 2011, and the Ph.D. degree in applied technology of computer science from Lanzhou University, Lanzhou, China, in 2017.

Since 2017, he has been an Assistant Professor with the School of Computer Science and Communications Engineering, Guangxi University of Science and Technology. He is the author of 10 articles. His research interests include complex networks and machine learning.

Dr. Yang is a member of Chinese computer society.