Network intrusion detection based on n-gram frequency and time-aware transformer
Introduction
In the past 20 years, network technology has developed rapidly and has been widely and deeply applied in the economy, the military, and education, profoundly affecting society’s development. At the same time, attacks against network systems have become increasingly rampant, especially during the COVID-19 epidemic. The scope of attacks keeps growing and new attack tools and methods keep emerging, which not only cause economic losses but even threaten national security. An intrusion detection system (IDS) Mukherjee et al. (1994) is a widely used network security device that can monitor network activities and detect threats in real time. A network intrusion detection system (NIDS) is a type of IDS deployed on network nodes to detect attacks by directly analyzing network traffic.
NIDS methods fall into two categories: misuse detection and anomaly detection Ghorbani et al. (2009). Misuse detection methods Modi et al. (2012); Shiri et al. (2011) use a series of rules to define malicious activities according to expert knowledge and detect attacks by matching network traffic against the rules one by one. Misuse detection is widely used in practical deployments Snort (2022); Zeek (2022) since it can detect attacks quickly with a low false alarm rate. However, attackers constantly upgrade their attack tools and strategies, and misuse detection methods cannot detect these unknown attacks. Anomaly detection methods can detect unknown attacks and have been the focus of intrusion detection in recent years Samrin and Vasumathi (2017). Machine learning (ML) and deep learning (DL) are the dominant technologies for intrusion detection Ahmad et al. (2021).
Machine learning-based intrusion detection methods first extract features from raw traffic by performing feature engineering, and then train a model to detect anomalies. Conventional models such as random forest Farnaaz and Jabbar (2016); Zhang et al. (2008) and support vector machine (SVM) Gu et al. (2019); Jing and Chen (2019) are widely used. Although machine learning-based methods have achieved relatively high performance, they gradually reach a bottleneck as data complexity and diversity increase. They depend heavily on features (e.g., the packet length, the flow duration) extracted through complex feature engineering, and the design of these features depends on expertise and causes information loss.
Deep learning-based intrusion detection methods break the bottleneck of machine learning-based methods and achieve better performance by automatically learning features from raw data. In the past decade, with the development of hardware and the availability of huge amounts of data, deep learning techniques have developed rapidly and achieved remarkable results in various applications, including intrusion detection. Representative deep learning methods include Convolutional Neural Network (CNN) Vinayakumar et al. (2017); Wang et al. (2017); Yu et al. (2021), Recurrent Neural Network (RNN) Park et al. (2020); Ullah and Mahmoud (2022), Long Short Term Memory (LSTM) Chen et al. (2022); Mirza and Cosan (2018), Transformer Kozik et al. (2021); Tan et al. (2019) and Generative Adversarial Network (GAN) Andresini et al. (2021); Lee and Park (2021). These methods have different focuses and learn features from different perspectives. For example, CNN learns the spatial features of network traffic, and LSTM learns the temporal features.
Although existing deep learning-based methods have achieved relatively high performance, they still have shortcomings. (1) The packet header and the packet payload differ significantly, and both play critical roles in intrusion detection, but most deep learning-based methods process them together as a whole, which prevents the model from learning more focused features. (2) The number of packets contained in a session and the size of packets are not fixed. Existing methods solve this problem by truncating or padding directly to a fixed length, but the truncated part cannot be used, which leads to a loss of information. (3) Time intervals between packets are ignored. A session can be considered as a sequence of multiple packets, just as a sentence can be regarded as a sequence of multiple words. However, unlike the words in a sentence, the distance between elements in a session varies over a wide range. Existing methods for ordinary sequences (e.g., RNN, LSTM, and Transformer) do not work well because temporal information is lost when sessions are processed.
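Shortcoming (2) can be made concrete with a minimal sketch of the fixed-length preprocessing it describes. The function names `to_fixed_length` and `discarded_bytes`, the zero pad byte, and the target length are illustrative assumptions, not taken from any specific method cited above.

```python
def to_fixed_length(payload: bytes, length: int, pad_byte: int = 0) -> bytes:
    """Truncate or pad a payload so that it is exactly `length` bytes long.

    Hypothetical helper: the pad value and target length are assumptions.
    """
    if len(payload) >= length:
        # Everything after the first `length` bytes is dropped.
        return payload[:length]
    # Short payloads are extended with a constant pad byte.
    return payload + bytes([pad_byte]) * (length - len(payload))


def discarded_bytes(payload: bytes, length: int) -> int:
    """Number of payload bytes that truncation makes unavailable to the model."""
    return max(0, len(payload) - length)
```

For a 6-byte payload and a target length of 4, two bytes never reach the model, whatever they contain; this is the information loss that point (2) refers to.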
On the other hand, n-gram analysis is used in many intrusion detection methods because of its conciseness and effectiveness. It is mainly used to model normal or attack behavior by constructing the signature or data distribution of the packet payload. However, these methods can only capture packet-level anomalies and have difficulty detecting emerging or complex attacks.
In this paper, we propose a novel intrusion detection model based on n-gram frequency and time-aware transformer (GTID) to overcome the challenges in deep learning-based methods and n-gram-based methods. Firstly, to distinguish the features of the packet header and the packet payload, we process them separately using different methods. Secondly, to make full use of variable-length sessions and packets, we calculate the n-gram frequency of packet payloads and convert them into fixed-length vectors. We also choose the Transformer Vaswani et al. (2017) to learn session features since the recursive structure of the Transformer enables it to handle variable-length sequences. Thirdly, to take the time interval between packets into account, we improve the original Transformer and propose a time-aware transformer to fully learn session features. Lastly, n-gram analysis is only used as a component of packet payload feature extraction in this paper. By combining it with the other parts, our method can model network traffic comprehensively and detect intrusions at both the packet level and the session level.
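The idea of turning a variable-length payload into a fixed-length vector via n-gram frequency can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `ngram_frequency`, the use of byte-level n-grams, the default `n = 1`, and normalization by the number of n-grams are all assumptions, since this excerpt does not specify them.

```python
from collections import Counter
from typing import List


def ngram_frequency(payload: bytes, n: int = 1) -> List[float]:
    """Map a payload of any length to a vector of 256**n relative frequencies.

    Assumed design: byte-level n-grams, each indexed by its big-endian
    integer value, with counts divided by the total number of n-grams.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    vector = [0.0] * (256 ** n)
    total = len(payload) - n + 1  # number of n-grams in the payload
    if total <= 0:
        return vector  # payload shorter than n: all-zero vector
    counts = Counter(payload[i:i + n] for i in range(total))
    for gram, count in counts.items():
        vector[int.from_bytes(gram, "big")] = count / total
    return vector
```

The output length depends only on `n` (256 entries for `n = 1`, 65536 for `n = 2`), so every byte of the payload contributes to the vector and nothing is truncated. The cost is that positional order beyond the n-gram window is not preserved.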
In summary, the main contributions of this paper are as follows:
- A novel intrusion detection model based on n-gram frequency and time-aware transformer (GTID) is proposed in this paper. It considers the characteristics of each component of raw network traffic, and can learn the packet-level features and session-level features of raw network traffic hierarchically.
- A simple and effective packet feature extraction method based on n-gram and DNN is introduced. The packet header and payload are processed separately, and variable-length packet payloads are converted to fixed-length vectors to minimize information loss.
- A time-aware transformer is presented to extract features of variable-length sessions. It takes time intervals between packets into consideration and is more suitable for handling network traffic than the traditional transformer.
- Extensive experiments are conducted to demonstrate the performance of GTID. GTID outperforms the compared methods and is competent for intrusion detection. Besides, additional experiments are performed to investigate the impact of the model and experimental settings.
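One generic way to let attention take packet time intervals into account is to subtract a penalty proportional to the time gap from each attention score. The sketch below illustrates that general idea only. It is not the time-aware transformer proposed in this paper, whose design is not given in this excerpt; the function name `time_penalized_attention`, the linear penalty, and the parameter `alpha` are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(row: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def time_penalized_attention(
    queries: Sequence[Sequence[float]],
    keys: Sequence[Sequence[float]],
    values: Sequence[Sequence[float]],
    timestamps: Sequence[float],
    alpha: float = 1.0,
) -> Tuple[List[List[float]], List[List[float]]]:
    """Scaled dot-product attention with an additive time-gap penalty.

    score(i, j) = q_i . k_j / sqrt(d) - alpha * |t_i - t_j|
    (hypothetical formulation; alpha = 0 gives ordinary attention).
    """
    d = len(keys[0])
    outputs, weights = [], []
    for q, t_q in zip(queries, timestamps):
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d) - alpha * abs(t_q - t_k)
            for k, t_k in zip(keys, timestamps)
        ]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of value vectors, one output vector per query packet.
        outputs.append(
            [sum(w_j * v[c] for w_j, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return outputs, weights
```

With three identical packet embeddings at timestamps 0.0, 0.1, and 5.0, the first packet attends more to the second than to the third when `alpha > 0`, whereas plain attention would weight all three equally.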
The remainder of this paper is organized as follows: Section 2 introduces some related work to this paper; Section 3 describes the detailed design and implementation of the proposed GTID; Section 4 presents the experiments and analyzes the result; Section 5 concludes this paper.
Section snippets
ML-based and DL-based intrusion detection methods
Since the concept of intrusion detection was first introduced Denning (1987), the problem of intrusion detection has been attracting the attention of researchers. Various types of methods have been proposed to solve the problem, among which machine learning-based and deep learning-based methods have attracted much attention.
Machine learning-based intrusion detection models are widely used in practice because of their efficiency and interpretability. The features of raw network traffic, like
Overall framework
The overall framework of GTID is shown in Fig. 1. GTID consists of a preprocessing module, a packet feature extraction module, a session feature extraction module, and a classification module. GTID organizes these modules in an end-to-end pipeline to reduce the time consumption and information loss associated with storing intermediate results, and achieves high detection performance through the cooperation of these modules.
The input of GTID is the
Experiment environment
The experiments are conducted on a server equipped with an eight-core Intel Xeon Silver 4210R CPU, an NVIDIA-430 GPU, 128GB of RAM, and a 512GB SSD, and the running environment is Ubuntu 16.04 and Python 3.7. Models are implemented with PyTorch and Keras. Besides, HuggingFace (Wolf et al., 2019) is used to speed up the implementation of the transformer.
Dataset description
To represent the current network environment properly and evaluate our methods comprehensively, we select two up-to-date and labeled
Parameters selection
To get better performance, there are a lot of parameters that need to be determined, including max session length, the value of and granularity of time smoothing. We will investigate the performance impact of these parameters and find the optimal parameters in the following subsections.
Conclusion and future work
In this paper, we propose a novel intrusion detection model based on n-gram frequency and time-aware transformer (GTID), which can fully use captured network traffic for better performance. GTID extracts the features of each packet and the features of the session as a whole in turn for detection. During this process, GTID considers the difference between packet header and packet payload and processes them separately. It properly deals with the variable-length payload, variable-length session,
CRediT authorship contribution statement
Xueying Han: Conceptualization, Methodology, Software, Investigation, Writing – original draft. Susu Cui: Methodology, Investigation. Song Liu: Software. Chen Zhang: Software. Bo Jiang: Formal analysis, Resources, Writing – review & editing. Zhigang Lu: Resources, Formal analysis.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
This research is supported by the National Key Research and Development Program of China (No.2019QY1300), the Youth Innovation Promotion Association CAS (No.2021156) and the Strategic Priority Research Program of Chinese Academy of Sciences (No.XDC02040100). This work is also supported by the Program of Key Laboratory of Network Assessment Technology, the Chinese Academy of Sciences, and the Program of Beijing Key Laboratory of Network Security and Protection Technology.
References (50)
- et al. GAN augmentation to deal with imbalance in imaging-based intrusion detection. Future Generat. Comput. Syst. (2021)
- et al. A quantitative approach for intrusions detection and prevention based on statistical n-gram models. Procedia Comput. Sci. (2012)
- et al. An efficient network behavior anomaly detection using a hybrid DBN-LSTM network. Comput. Secur. (2022)
- et al. Random forest modeling for network intrusion detection system. Procedia Comput. Sci. (2016)
- et al. An effective intrusion detection approach using SVM with naïve Bayes feature embedding. Comput. Secur. (2021)
- et al. A novel approach to intrusion detection using SVM ensemble with feature augmentation. Comput. Secur. (2019)
- et al. Integrating signature apriori based network intrusion detection system (NIDS) in cloud computing. Procedia Technol. (2012)
- et al. Toward developing a systematic approach to generate benchmark datasets for intrusion detection. Comput. Secur. (2012)
- et al. Intrusion detection for capsule networks based on dual routing mechanism. Comput. Netw. (2021)
- et al. PBCNN: packet bytes-based convolutional neural network for network intrusion detection. Comput. Netw. (2021)
- Random-forests-based network intrusion detection systems. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.)
- Network intrusion detection system: a systematic study of machine learning and deep learning approaches. Trans. Emerg. Telecommun. Technol.
- N-grams exclusion and inclusion filter for intrusion detection in internet of energy big data systems. Trans. Emerg. Telecommun. Technol.
- Computer security threat monitoring and surveillance. Technical Report, James P. Anderson Company
- Dos and donts of machine learning in computer security. Proc. of the USENIX Security Symposium
- Predicting sentences using n-gram language models. Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing
- Poseidon: a 2-tier anomaly-based network intrusion detection system. Fourth IEEE International Workshop on Information Assurance (IWIA’06)
- Class-based n-gram models of natural language. Comput. Linguistic.
- An intrusion-detection model. IEEE Trans. Softw. Eng.
- Transformer feed-forward layers are key-value memories. arXiv preprint arXiv:2012.14913
- Network intrusion detection and prevention: Concepts and techniques
- Layered higher order n-grams for hardening payload based anomaly intrusion detection. 2010 International Conference on Availability, Reliability and Security
- SVM based network intrusion detection for the UNSW-NB15 dataset. 2019 IEEE 13th International Conference on ASIC (ASICON)
- A new method of hybrid time window embedding with transformer-based traffic data classification in IoT-networked environment. Pattern Anal. Appl.
Xueying Han received the B.S. degree from Beijing University of Post and Telecommunication in 2020. She is currently pursuing the Ph.D. degree with the Institute of Information Engineering, University of Chinese Academy of Sciences, Beijing, China. Her research interests include deep learning and network security.
Susu Cui received the B.S. degree from Nanchang University in 2019. She is currently pursuing the Ph.D. degree with the Institute of Information Engineering, University of Chinese Academy of Sciences, Beijing, China. Her research interests include encrypted traffic analysis and network security.
Song Liu received his M.S. degree from the Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China, in 2018. He is currently an engineer in the Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China. His current research interests include network attack detection and defense, big data processing.
Chen Zhang received the M.S. degree in China University of Geosciences in 2016. He is an engineer at the Institute of Information Engineering, Chinese Academy of Sciences. His research interests include network security and network situational awareness.
Bo Jiang received the Ph.D. degree in Chinese Academy of Sciences in 2016. He is an assistant professor at the Institute of Information Engineering, Chinese Academy of Sciences. His research interests include network situational awareness, knowledge graph and data mining.
Zhigang Lu received the Ph.D. degree in Chinese Academy of Sciences in 2010. He is a professor at the Institute of Information Engineering, Chinese Academy of Sciences. His research interests include network security and network situational awareness.