Network intrusion detection based on n-gram frequency and time-aware transformer

https://doi.org/10.1016/j.cose.2023.103171

Abstract

Network intrusion detection systems play a critical role in protecting target networks from attacks. However, most existing detection methods cannot fully utilize the information contained in raw network traffic: information is lost during feature extraction and the feature dimensions are incomplete, which leads to performance bottlenecks. In this paper, we propose a novel intrusion detection model based on n-gram frequency and a time-aware transformer, called GTID. GTID learns traffic features hierarchically at the packet level and the session level and minimizes information loss as much as possible. To extract packet-level features effectively, GTID considers the different roles of the packet header and the payload and processes them in different ways; n-gram frequency is used to represent payload contextual information because of its conciseness. GTID then uses the proposed time-aware transformer to learn session-level features for intrusion detection. The time-aware transformer considers the time intervals between packets and learns the temporal features of a session for classification. For evaluation, several experiments are conducted on the ISCX2012 and CICIDS2017 datasets, and the results show the effectiveness and robustness of GTID.

Introduction

In the past 20 years, network technology has developed rapidly and has been widely and deeply applied in the economy, the military, and education, profoundly impacting society’s development. At the same time, attacks against network systems have become increasingly rampant, especially during the COVID-19 epidemic. The scope of attacks is increasing, and various new attack tools and methods are emerging, which not only cause economic losses but also threaten national security. An intrusion detection system (IDS) Mukherjee et al. (1994) is a widely used network security device that can monitor network activities and detect threats in real time. A network intrusion detection system (NIDS) is a type of IDS deployed on network nodes to detect attacks by directly analyzing network traffic.

NIDS methods include misuse detection and anomaly detection Ghorbani et al. (2009). Misuse detection methods Modi et al. (2012); Shiri et al. (2011) use a series of rules to define malicious activities according to expert knowledge and detect attacks by matching network traffic against the rules one by one. Misuse detection is widely used in practical deployments Snort (2022); Zeek (2022) since it can detect attacks quickly with a low false alarm rate. However, attackers constantly upgrade their attack tools and strategies, and misuse detection methods cannot detect these unknown attacks. Anomaly detection methods can detect unknown attacks and have been the focus of intrusion detection in recent years Samrin and Vasumathi (2017). Machine learning (ML) and deep learning (DL) are the dominant technologies for intrusion detection Ahmad et al. (2021).

Machine learning-based intrusion detection methods first extract features from raw traffic by performing feature engineering, and then train a model to detect anomalies. Conventional models such as random forest Farnaaz and Jabbar (2016); Zhang et al. (2008) and support vector machine (SVM) Gu et al. (2019); Jing and Chen (2019) are widely used. Although machine learning-based methods have achieved relatively high performance, they gradually reach a bottleneck as data complexity and diversity increase. They depend heavily on features (e.g., the packet length, the flow duration) extracted through complex feature engineering, and the design of these features depends on expertise and causes information loss.

Deep learning-based intrusion detection methods break the bottleneck of machine learning-based methods and achieve better performance by automatically learning features from raw data. In the past decade, with the development of hardware and the generation of huge amounts of data, deep learning techniques have developed rapidly and achieved remarkable results in various applications, including intrusion detection. Representative deep learning methods include Convolutional Neural Network (CNN) Vinayakumar et al. (2017); Wang et al. (2017); Yu et al. (2021), Recurrent Neural Network (RNN) Park et al. (2020); Ullah and Mahmoud (2022), Long Short-Term Memory (LSTM) Chen et al. (2022); Mirza and Cosan (2018), Transformer Kozik et al. (2021); Tan et al. (2019) and Generative Adversarial Network (GAN) Andresini et al. (2021); Lee and Park (2021). These methods have different focuses and learn features from different perspectives. For example, CNN learns the spatial features of network traffic, and LSTM learns the temporal features.

Although existing deep learning-based methods have achieved relatively high performance, they still have shortcomings. (1) The packet header and the packet payload, which differ significantly, both play critical roles in intrusion detection, but most deep learning-based methods process them together as a whole, which prevents the model from learning more focused features. (2) The number of packets in a session and the size of each packet are not fixed. Existing methods solve this problem by directly truncating or padding to a fixed length, but the truncated part cannot be used, which inevitably leads to information loss. (3) Time intervals between packets are ignored. A session can be considered a sequence of packets, just as a sentence can be regarded as a sequence of words. However, unlike in a sentence, the distance between elements in a session varies widely. Existing methods for ordinary sequences (e.g., RNN, LSTM, and Transformer) do not work well on sessions because temporal information is lost during processing.
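Shortcoming (2) can be made concrete with a small sketch of forcing payloads to a fixed length. The function name `pad_or_truncate`, the 32-byte length, and the sample payloads are illustrative assumptions, not taken from any of the cited methods.

```python
def pad_or_truncate(payload: bytes, length: int) -> bytes:
    """Force a payload to exactly `length` bytes (hypothetical helper)."""
    if len(payload) >= length:
        # Everything after the first `length` bytes is discarded.
        return payload[:length]
    # Short payloads are filled up with zero bytes.
    return payload + b"\x00" * (length - len(payload))


# Made-up payloads for illustration only.
short_payload = b"GET / HTTP/1.1"
long_payload = b"A" * 40 + b"<malicious tail>"

fixed_short = pad_or_truncate(short_payload, 32)
fixed_long = pad_or_truncate(long_payload, 32)
lost_bytes = len(long_payload) - len(fixed_long)

print(len(fixed_short), len(fixed_long), lost_bytes)
```

In this toy case the tail of the long payload never reaches the model, which is the information loss the paragraph describes.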

On the other hand, n-gram analysis is used in many intrusion detection methods because of its conciseness and effectiveness. It is mainly used to model normal or attack behavior by constructing the signature or data distribution of the packet payload. However, these methods can only capture packet-level anomalies and have difficulty detecting emerging or complex attacks.

In this paper, we propose a novel intrusion detection model based on n-gram frequency and a time-aware transformer (GTID) to overcome the challenges in deep learning-based methods and n-gram-based methods. First, to distinguish the features of the packet header and the packet payload, we process them separately using different methods. Second, to make full use of variable-length sessions and packets, we calculate the n-gram frequency of packet payloads and convert them into fixed-length vectors. We also choose the transformer Vaswani et al. (2017) to learn session features since its structure enables it to handle variable-length sequences. Third, to take the time interval between packets into account, we improve the original transformer and propose a time-aware transformer to fully learn session features. Lastly, n-gram analysis is used only as one component of packet payload feature extraction in this paper. By combining it with the other parts, our method can model network traffic comprehensively and detect intrusions at both the packet level and the session level.
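The idea of turning a variable-length payload into a fixed-length vector through n-gram frequency can be sketched as follows. This is a minimal illustration under stated assumptions: byte-level n-grams, relative frequencies normalized by the number of n-grams, and a dense vector of size 256^n. The excerpt does not specify the value of n, the normalization, or the vector layout used in GTID, and the function name `ngram_frequency` is hypothetical.

```python
from collections import Counter
from typing import List


def ngram_frequency(payload: bytes, n: int = 1) -> List[float]:
    """Return a vector of length 256**n holding relative byte n-gram frequencies.

    Assumption: frequencies are normalized by the number of n-grams in the payload.
    """
    vector = [0.0] * (256 ** n)
    total = len(payload) - n + 1
    if total <= 0:
        # Payload shorter than n bytes: no n-grams, return the all-zero vector.
        return vector
    counts = Counter(payload[i:i + n] for i in range(total))
    for gram, count in counts.items():
        # Interpret the n bytes as a base-256 number to get the vector index.
        vector[int.from_bytes(gram, "big")] = count / total
    return vector


# Payloads of any length map to vectors of the same size.
unigram = ngram_frequency(b"abab", n=1)
bigram = ngram_frequency(b"abab", n=2)
print(len(unigram), len(bigram))
```

The vector size depends only on n, not on the payload length, so no payload bytes have to be truncated. The cost is that the dense vector grows as 256^n, which is one reason the choice of n matters.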

In summary, the main contributions of this paper are as follows:

  • A novel intrusion detection model based on n-gram frequency and time-aware transformer (GTID) is proposed in this paper. It considers the characteristics of each component of raw network traffic, and can learn the packet-level features and session-level features of raw network traffic hierarchically.

  • A simple and effective packet feature extraction method based on n-gram and DNN is introduced. The packet header and payload are processed separately, and variable-length packet payloads are converted to fixed-length vectors to minimize information loss.

  • A time-aware transformer is presented to extract features of variable-length sessions. It takes the time intervals between packets into consideration and is more suitable for handling network traffic than the traditional transformer.

  • Solid experiments are conducted to demonstrate the excellent performance of GTID. GTID is superior to other methods and is competent for intrusion detection. Besides, additional experiments are performed to investigate the impact of the model and experimental settings.
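The time intervals between packets mentioned in the contributions can be derived from packet timestamps. The sketch below computes them and maps each one to a coarse bucket. The log-scaled bucketing, the `granularity` parameter, and the function names are hypothetical illustrations. The excerpt does not describe how the time-aware transformer actually encodes intervals, so this should not be read as the authors' mechanism.

```python
import math
from typing import List


def time_intervals(timestamps: List[float]) -> List[float]:
    """Gap in seconds between each packet and the previous one (0.0 for the first)."""
    if not timestamps:
        return []
    gaps = [max(0.0, later - earlier)
            for earlier, later in zip(timestamps, timestamps[1:])]
    return [0.0] + gaps


def bucketize(interval: float, granularity: float, num_buckets: int = 16) -> int:
    """Map an interval to a log-scaled bucket index (assumed smoothing scheme)."""
    if interval < granularity:
        return 0
    bucket = 1 + int(math.log2(interval / granularity))
    # Very long gaps all share the last bucket.
    return min(bucket, num_buckets - 1)


# Made-up capture times (seconds) of four packets in one session.
timestamps = [0.0, 0.5, 0.5, 4.5]
intervals = time_intervals(timestamps)
buckets = [bucketize(gap, granularity=0.25) for gap in intervals]
print(intervals)
print(buckets)
```

A sequence model that sees only packet order would treat the 0.5-second gap and the 4-second gap identically. Passing the intervals (or their buckets) alongside each packet is one way to keep that information available.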

The remainder of this paper is organized as follows: Section 2 introduces related work; Section 3 describes the detailed design and implementation of the proposed GTID; Section 4 presents the experiments and analyzes the results; Section 5 concludes the paper.

ML-based and DL-based intrusion detection methods

Since the concept of intrusion detection was first introduced Denning (1987), the problem of intrusion detection has been attracting the attention of researchers. Various types of methods have been proposed to solve the problem, among which machine learning-based and deep learning-based methods have attracted much attention.

Machine learning-based intrusion detection models are widely used in practice because of their efficiency and interpretability. The features of raw network traffic, like

Overall framework

The overall framework of GTID is shown in Fig. 1. GTID consists of a preprocessing module, a packet feature extraction module, a session feature extraction module, and a classification module. GTID organizes these modules in an end-to-end pipeline to reduce the time consumption and information loss associated with storing intermediate results, and it achieves high detection performance through the cooperation of these modules.

The input of GTID is the

Experiment environment

The experiments are conducted on a server equipped with an eight-core Intel Xeon Silver 4210R CPU, an NVIDIA-430 GPU, 128 GB of RAM, and a 512 GB SSD; the running environment is Ubuntu 16.04 with Python 3.7. Models are implemented with PyTorch and Keras. In addition, HuggingFace (Wolf et al., 2019) is used to speed up the implementation of the transformer.

Dataset description

To represent the current network environment properly and evaluate our methods comprehensively, we select two up-to-date and labeled

Parameters selection

To obtain better performance, several parameters need to be determined, including the maximum session length, the value of n, and the granularity of time smoothing. We investigate the performance impact of these parameters and find their optimal values in the following subsections.

Conclusion and future work

In this paper, we propose a novel intrusion detection model based on n-gram frequency and time-aware transformer (GTID), which can fully use captured network traffic for better performance. GTID extracts the features of each packet and the features of the session as a whole in turn for detection. During this process, GTID considers the difference between packet header and packet payload and processes them separately. It properly deals with the variable-length payload, variable-length session,

CRediT authorship contribution statement

Xueying Han: Conceptualization, Methodology, Software, Investigation, Writing – original draft. Susu Cui: Methodology, Investigation. Song Liu: Software. Chen Zhang: Software. Bo Jiang: Formal analysis, Resources, Writing – review & editing. Zhigang Lu: Resources, Formal analysis.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This research is supported by the National Key Research and Development Program of China (No. 2019QY1300), the Youth Innovation Promotion Association CAS (No. 2021156), and the Strategic Priority Research Program of the Chinese Academy of Sciences (No. XDC02040100). This work is also supported by the Program of the Key Laboratory of Network Assessment Technology, Chinese Academy of Sciences, and the Program of the Beijing Key Laboratory of Network Security and Protection Technology.

References

  • J. Zhang et al. Random-forests-based network intrusion detection systems. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) (2008)
  • Z. Ahmad et al. Network intrusion detection system: a systematic study of machine learning and deep learning approaches. Trans. Emerg. Telecommun. Technol. (2021)
  • M. Aldwairi et al. N-grams exclusion and inclusion filter for intrusion detection in internet of energy big data systems. Trans. Emerg. Telecommun. Technol. (2022)
  • J.P. Anderson. Computer security threat monitoring and surveillance. Technical Report, James P. Anderson Company (1980)
  • D. Arp et al. Dos and don'ts of machine learning in computer security. Proc. of the USENIX Security Symposium (2022)
  • S. Bickel et al. Predicting sentences using n-gram language models. Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (2005)
  • D. Bolzoni et al. Poseidon: a 2-tier anomaly-based network intrusion detection system. Fourth IEEE International Workshop on Information Assurance (IWIA’06) (2006)
  • P.F. Brown et al. Class-based n-gram models of natural language. Comput. Linguist. (1992)
  • Canadian Institute for Cybersecurity, 2017. Cicflowmeter....
  • D.E. Denning. An intrusion-detection model. IEEE Trans. Softw. Eng. (1987)
  • M. Geva et al. Transformer feed-forward layers are key-value memories. arXiv preprint arXiv:2012.14913 (2020)
  • A.A. Ghorbani et al. Network intrusion detection and prevention: Concepts and techniques (2009)
  • N. Hubballi et al. Layered higher order n-grams for hardening payload based anomaly intrusion detection. 2010 International Conference on Availability, Reliability and Security (2010)
  • D. Jing et al. SVM based network intrusion detection for the UNSW-NB15 dataset. 2019 IEEE 13th International Conference on ASIC (ASICON) (2019)
  • R. Kozik et al. A new method of hybrid time window embedding with transformer-based traffic data classification in IoT-networked environment. Pattern Anal. Appl. (2021)
    Xueying Han received the B.S. degree from Beijing University of Posts and Telecommunications in 2020. She is currently pursuing the Ph.D. degree with the Institute of Information Engineering, University of Chinese Academy of Sciences, Beijing, China. Her research interests include deep learning and network security.

    Susu Cui received the B.S. degree from Nanchang University in 2019. She is currently pursuing the Ph.D. degree with the Institute of Information Engineering, University of Chinese Academy of Sciences, Beijing, China. Her research interests include encrypted traffic analysis and network security.

    Song Liu received his M.S. degree from the Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China, in 2018. He is currently an engineer in the Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China. His current research interests include network attack detection and defense, big data processing.

    Chen Zhang received the M.S. degree from China University of Geosciences in 2016. He is an engineer at the Institute of Information Engineering, Chinese Academy of Sciences. His research interests include network security and network situational awareness.

    Bo Jiang received the Ph.D. degree from the Chinese Academy of Sciences in 2016. He is an assistant professor at the Institute of Information Engineering, Chinese Academy of Sciences. His research interests include network situational awareness, knowledge graphs, and data mining.

    Zhigang Lu received the Ph.D. degree from the Chinese Academy of Sciences in 2010. He is a professor at the Institute of Information Engineering, Chinese Academy of Sciences. His research interests include network security and network situational awareness.
