Hybrid fuzzy integrated convolutional neural network (HFICNN) for similarity feature recognition problem in abnormal netflow detection
Introduction
With the rapid development of networks, security issues now touch a range of real-world problems. Among them, abnormal netflow attacks are significant, difficult to detect, and related to personal safety. Many researchers have studied abnormal netflow detection. Lin Weichao et al. [1] proposed a new feature representation method based on the distance between samples and cluster centers. They found that the dimension of the feature representation can affect detection performance, and that feature selection is a representative approach to anomaly detection. Tavallaee et al. [2] analyzed the KDD99 dataset [3], [4], [5], found that it contains a large number of redundant records that strongly affect the evaluated performance of detection models, and proposed the new NSL-KDD dataset. Sabhnani et al. [6] compared ten kinds of machine learning algorithms and proposed a multi-classification model combining the multi-layer perceptron, Gaussian classifier, and k-means algorithm. They noted that the effectiveness of a detection algorithm is tied to specific attack categories. Dong et al. [7] encoded test data using a deep belief network (DBN) model that employs an unsupervised learning algorithm in each layer. They pointed out that deep learning can analyze network security in more ways than traditional machine learning. Aminanto et al. [8] used an artificial neural network (ANN) for feature selection, and confirmed that simplified input features can sufficiently meet the requirements of the classification task. Tamar et al. [9] divided a dataset by spectral clustering and extracted features using an autoencoder. They argued that a more complex network structure reduces learning efficiency. Most of these studies selected strongly influencing features [10] for machine learning and deep learning [11], [12], [13], while few researchers have asked why attacks cannot be detected accurately.
Abnormal netflow attacks are usually hidden, destructive, and uncertain [14], [15], [16]. Abnormal netflow detection relies on feature extraction from network equipment, which requires considerable expertise. Furthermore, a realistic extraction process is complex and time-consuming. Under these circumstances, the number of extracted feature dimensions is usually small and the features do not separate the classes clearly. The best-known public dataset today is KDD99, which contains 41-dimensional features, many of which take zero values. When the detected dimension is low, normal and abnormal netflows can resemble each other so closely that their extracted features are identical, making it difficult to detect abnormal netflow precisely. Fig. 1 displays a filtered part of the ICMPv6-based DDoS attacks dataset. A large number of netflow records with different labels share the same eigenvalues. Moreover, there are many zero values throughout the dataset.
Given the above facts, traditional machine learning and deep learning cannot recognize such records accurately because, regardless of which learning model is constructed, all of its parameters are fixed once training ends. Even if the model performs well in testing, once new records with the same features but different labels are presented to it, they will be misclassified. In other words, the detection accuracy depends largely on the proportion of records in the data that share features but carry different labels. Hence, we must find a new way to transform differently labeled individuals with the same eigenvalues so that they have different eigenvalues.
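The effect described above can be quantified on a toy example. The following is a minimal sketch, not taken from the paper: the function name `accuracy_ceiling` and the toy records are hypothetical. It assumes a deterministic model that must return one label per feature vector, so for each group of identical feature vectors it can be correct on at most the majority label.

```python
from collections import Counter, defaultdict


def accuracy_ceiling(records):
    """Upper bound on accuracy for any deterministic feature -> label model.

    `records` is an iterable of (features, label) pairs (hypothetical format).
    Returns (ceiling, number_of_records_in_conflicting_groups).
    """
    by_features = defaultdict(Counter)
    for features, label in records:
        by_features[tuple(features)][label] += 1

    total = sum(sum(c.values()) for c in by_features.values())
    # Per identical-feature group, at most the majority label can be right.
    best = sum(max(c.values()) for c in by_features.values())
    # Records whose feature vector appears with more than one label.
    conflicting = sum(
        sum(c.values()) for c in by_features.values() if len(c) > 1
    )
    return best / total, conflicting


# Toy data: three records share the features (0, 1, 0) but not the label.
toy = [
    ((0, 1, 0), "normal"),
    ((0, 1, 0), "attack"),
    ((0, 1, 0), "attack"),
    ((1, 0, 2), "normal"),
]
ceiling, conflicting = accuracy_ceiling(toy)
```

On this toy data, three of the four records fall in a conflicting group, and no fixed mapping from features to labels can classify more than three of the four correctly.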
Flow feature acquisition is a feature extraction process carried out over a certain period of time. Fig. 2 shows part of the netflow dataset displayed in Wireshark, strictly ordered by occurrence time. Adding a unique time feature could enable accurate feature recognition. However, the flow occurrence time cannot be added directly to the training data as a new recognition feature. Therefore, we integrate the features of the flow at the next time step to form a new integrated feature that can distinguish different classes, so that time-dependent individuals are identified as a new combined netflow. This addresses the similarity feature recognition problem.
Records affected by the similarity feature recognition problem cannot be told apart, just as identical human twins cannot be. When the netflow characteristics at one moment are identical to those at another, the next one or more adjacent netflows can be integrated until a new integrated individual is created that distinguishes them. The number of new integration records is called the integration parameter (q). Fig. 3 shows the integration process with different integration parameters. Fig. 4 shows the main integration training process in our method. The original individuals with the same features are converted to new integrated individuals with different features. Feature learning is then carried out to identify the whole combination. Because the features are integrated, the corresponding labels must be mapped by a function to a new label. After training, the combined results are mapped back by the inverse function to the original individuals, so as to solve the similarity feature recognition problem.
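The integration step can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes that integrating with parameter q means concatenating each record with its next q records in time order, and the names `integrate`, `is_separable`, and `smallest_q` are hypothetical.

```python
def integrate(features, labels, q):
    """Concatenate each record with its next q records.

    Assumes `features` and `labels` are already ordered by occurrence time.
    The last q records have no complete window and are dropped.
    """
    if q < 0:
        raise ValueError("q must be non-negative")
    n = len(features) - q
    new_features = [
        tuple(v for row in features[i:i + q + 1] for v in row)
        for i in range(n)
    ]
    new_labels = [tuple(labels[i:i + q + 1]) for i in range(n)]
    return new_features, new_labels


def is_separable(features, labels):
    """True if no feature vector occurs with two different labels."""
    seen = {}
    return all(seen.setdefault(f, l) == l for f, l in zip(features, labels))


def smallest_q(features, labels, max_q):
    """Smallest q (up to max_q) that removes all feature/label conflicts."""
    for q in range(min(max_q, len(features) - 1) + 1):
        if is_separable(*integrate(features, labels, q)):
            return q
    return None


# Toy flows: the first two records have identical features, different labels.
flows = [(0, 1), (0, 1), (2, 3), (0, 1), (4, 4)]
tags = ["normal", "attack", "normal", "attack", "normal"]
q_found = smallest_q(flows, tags, max_q=3)
merged_features, merged_labels = integrate(flows, tags, 1)
```

With q = 0 the toy records conflict, while with q = 1 every integrated record is unique, so the integrated label tuples can be learned without ambiguity.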
Hybrid architectures are developed using constructs of computational intelligence such as fuzzy sets, convolutional neural networks (CNNs), and deep neural networks (DNNs). They show good performance in some fields [17], [18], [19], [20]. A combination of adaptive feature integration methods, CNNs, and DNNs is therefore a viable way to deal with the problem. We adopt the adaptive feature integration method to integrate the input data of different individuals, and the function mapping label redefinition (FMLR) method to integrate their output labels. Because of this integrated processing, the subsequent output layers must produce their output in stages. The first output layer is connected to the DNN output layer, so the value of each output result must be converted to a subordinate category, which is determined by fuzzy classification (FC). Moreover, a large number of fuzzy neural networks composed of fuzzy sets and neural networks have been proposed in the literature [21], [22], [23]. By combining fuzzy sets and neural networks, the fuzzy neural network becomes an effective tool for the neural network classification problem [24], [25], [26], so we choose fuzzy classification as the first-stage output method in a hybrid fuzzy integrated convolutional neural network (HFICNN). In the second output layer, the first-stage integrated output must be transformed to the original classification results. Because the labels are integrated by the FMLR process, we select inverse function mapping integrated reduction (IFMIR) as the inverse function to restore the combined flow label and convert it to the original classification label. To deal with the similarity feature recognition problem in abnormal netflow detection, we propose adaptive feature integration convolutional neurons (AFICNs) and FMLR to develop an HFICNN through the FC and IFMIR methods. For example, the integration process is shown in Fig. 5 for the case that q = 1.
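The label mapping and its inverse can be illustrated with a simple invertible encoding. The paper's actual mapping functions are not given in this section, so the following is only a sketch under an assumption: a tuple of per-record labels is encoded as a single integer in base `len(classes)`. The names `fmlr_encode` and `ifmir_decode` are hypothetical.

```python
from itertools import product


def fmlr_encode(label_tuple, classes):
    """Map a tuple of original labels to one integrated label code.

    Assumed scheme: positional encoding in base len(classes).
    """
    code = 0
    for label in label_tuple:
        code = code * len(classes) + classes.index(label)
    return code


def ifmir_decode(code, classes, length):
    """Inverse mapping: recover the tuple of original labels from a code."""
    labels = []
    for _ in range(length):
        code, index = divmod(code, len(classes))
        labels.append(classes[index])
    return tuple(reversed(labels))


classes = ["normal", "attack"]
# With q = 1, two records are integrated, so label tuples have length 2.
code = fmlr_encode(("attack", "normal"), classes)
restored = ifmir_decode(code, classes, length=2)

# Every label tuple of length 2 survives the round trip.
round_trip_ok = all(
    ifmir_decode(fmlr_encode(t, classes), classes, 2) == t
    for t in product(classes, repeat=2)
)
```

Any one-to-one mapping would serve the same purpose. The only property the second output stage relies on is that the integrated label can be mapped back to the original per-record labels without loss.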
The contributions of this paper can be summarized as follows.
First, unlike the common perspective that asks which features most strongly influence abnormal netflow detection, we study why abnormal netflow cannot be detected accurately. We analyze the realistic environment and the practical problems of abnormal flow detection and identify the similarity feature recognition problem: records with different labels share the same netflow eigenvalues, which traditional machine learning and deep learning methods cannot solve well.
Second, we present HFICNN to deal with the problem. By constructing an integration method, the model converts individuals with the same features but different labels into individuals with different features and different labels. Our method accounts for the actual environment and operational complexity by adding a unique occurrence time feature to feature recognition. HFICNN integrates time-dependent records into new records to identify features.
Third, we combine the integration method with CNN, and propose adaptive feature integration convolutional neurons (AFICNs), which can be regarded as a generalized type of CNN. Unlike traditional convolutional neurons, AFICNs operate on the dimensional characteristics of different individuals rather than of a single individual. We also propose function mapping label redefinition (FMLR), which integrates the labels of different individuals.
Fourth, after AFICNN feature extraction, we make full use of DNN for deep feature recognition. This requires a transition from the AFICNN layers to the DNN layers: the AFICNN layer outputs are transformed into the DNN inputs. Each layer of the model is highly flexible, and the integration parameters can be adjusted adaptively. The numbers of layers and neurons are tunable parameters for achieving efficient final classification results.
Last, the integrated label must be inversely mapped to the original label to identify the original individuals. Unlike the traditional single-layer-output neural network, our HFICNN outputs results in stages. In the first output stage, fuzzy classification (FC) and deep learning methods are combined to form AFICNN. In the second output stage, inverse function mapping integrated reduction (IFMIR) is applied to restore the original classification results, and then a complete fuzzy integrated convolution neural network is formed.
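The first output stage converts a continuous network output into an integrated label category. The membership functions used in the paper are not specified in this section, so the sketch below makes an assumption: it uses triangular membership functions centered on integer label codes and selects the code with the highest membership. The names `triangular` and `fuzzy_classify` are hypothetical.

```python
def triangular(x, center, width=1.0):
    """Triangular membership: 1 at the center, falling to 0 at +/- width."""
    return max(0.0, 1.0 - abs(x - center) / width)


def fuzzy_classify(value, codes, width=1.0):
    """Assign a continuous output value to the code with highest membership.

    Returns (best_code, memberships). Ties go to the first code listed.
    """
    memberships = {c: triangular(value, c, width) for c in codes}
    best = max(memberships, key=memberships.get)
    return best, memberships


# Assumed setting: four integrated label codes (two classes, q = 1).
codes = [0, 1, 2, 3]
best_code, degrees = fuzzy_classify(1.8, codes)
```

Here an output of 1.8 has a membership of about 0.8 in code 2 and about 0.2 in code 1, so it is assigned to code 2. The second output stage would then map this code back to the original per-record labels.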
The paper is organized as follows. Section 1 introduces the similarity feature recognition problem and the motivation for constructing the HFICNN method. Section 2 introduces related work and the design of HFICNN. Section 3 describes the HFICNN architecture and algorithm. Section 4 explains the HFICNN procedures. Section 5 introduces the dataset and discusses the experimental results. Section 6 presents our conclusions.
Related work
We introduce the basic principles of some key methods of CNN, DNN, and FC, and propose our HFICNN model.
Architecture of HFICNN
We elaborate on the design and architecture of HFICNN. Fig. 7(a) describes a general architecture of HFICNN. It is essentially a new architecture that can be regarded as an improved CNN, as shown in Fig. 7(e). The labels are integrated into the new integrated labels by function mapping label redefinition (FMLR). The HFICNN model has three parts. First, it splits the original data into feature input data and label output data. It integrates multiple netflow records in time-related order, and
Overall design procedure of HFICNN model
We describe the HFICNN procedure in detail. The brief design procedure is shown in Fig. 9. The overall design procedure is shown in Fig. 10. The procedure has four steps. First is data integration, which splits the original data into feature input and label output data. It integrates multiple netflow records according to the time-dependent order and chronologically merges the self-arranged netflow into new records. HFICNN reintegrates the corresponding integrated labels into new labels
Experiments
We elaborate on the experimental dataset and results.
Conclusions
This paper presents a hybrid fuzzy integrated convolutional neural network (HFICNN) to deal with the similarity feature recognition problem in abnormal netflow detection, in which records with different labels but the same netflow eigenvalues are not handled well by traditional machine learning and deep learning methods. By constructing an integration method, the HFICNN model can convert individuals with the same features but different labels into individuals with different features and different labels. Our
CRediT authorship contribution statement
Xin Yue: Conceptualization, Methodology, Investigation, Software, Data curation, Writing - original draft. Jinsong Wang: Conceptualization, Resources, Supervision, Project administration, Funding acquisition. Wei Huang: Conceptualization, Methodology, Investigation, Supervision.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
The work was supported by the National Key Research and Development Plan of China (No. 2018YFC0831405), the Natural Science Foundation of Tianjin (No. 18JCZDJC30700), the New Generation Artificial Intelligence Technology Major Project of Tianjin (No. 19ZXZNGX000801100), the National Natural Science Foundation of China (No. 61673295), the Natural Science Foundation of Tianjin for Distinguished Young Scholars (No. 19JCJQJC61500), and the Natural Science Foundation of Tianjin (No. 18JCYBJC85200).
Xin Yue received the B.Sc. and M.Sc. degrees from the Department of Computer Science and Engineering, Tianjin University of Technology, Tianjin, China. She is now a Ph.D. candidate at Tianjin University of Technology.
Her main research interests include computer networks, artificial intelligence, and evolutionary computation.
References (61)
- et al., CANN: An intrusion detection system based on combining cluster centers and nearest neighbors, Knowl.-Based Syst. (2015)
- et al., Application of deep learning to cybersecurity: a survey, Neurocomputing (2019)
- et al., Intrusion detection model using fusion of chi-square feature selection and multi class SVM, J. King Saud Univ. - Comput. Inf. Sci. (2017)
- et al., Automated detection and classification of liver fibrosis stages using contourlet transform and nonlinear features, Comput. Methods Prog. Biomed. (2018)
- et al., Discriminative dictionary pair learning based on differentiable support vector function for visual recognition, Neurocomputing (2018)
- et al., Evolutional RBFNs image model describing-based segmentation system designs, Neurocomputing (2018)
- et al., Crossing generative adversarial networks for cross-view person re-identification, Neurocomputing (2019)
- et al., Distributed attack detection scheme using deep learning approach for Internet of Things, Futur. Gener. Comput. Syst. (2018)
- et al., Efficient DDoS flood attack detection using dynamic thresholding on flow-based network traffic, Comput. Secur. (2019)
- et al., Deep learning with support vector data description, Neurocomputing (2015)
- A deep learning framework for identifying children with ADHD using an EEG-based brain network, Neurocomputing
- Abnormal behavior recognition for intelligent video surveillance systems: a review, Expert Syst. Appl.
- Neural-network-based adaptive fault-tolerant tracking control of uncertain nonlinear time-delay systems under output constraints and infinite number of actuator faults, Neurocomputing
- A novel intrusion detection system based on hierarchical clustering and support vector machines, Expert Syst. Appl.
- Finite-time adaptive fuzzy output-feedback control of MIMO nonlinear systems with hysteresis, Neurocomputing
- Object class segmentation of RGB-D video using recurrent convolutional neural networks, Neural Networks
- Moving object detection via segmentation and saliency constrained RPCA, Neurocomputing
- Adaptive activation functions in convolutional neural networks, Neurocomputing
- Graph classification based on graph set reconstruction and graph kernel feature reduction, Neurocomputing
- Evaluating deep learning architectures for speech emotion recognition, Neural Networks
- Appearance based pedestrians head pose and body orientation estimation using deep learning, Neurocomputing
- A hybrid approach combining an extended BBO algorithm with an intuitionistic fuzzy entropy weight method for QoS-aware manufacturing service supply chain optimization, Neurocomputing
- Proactive management of SLA violations by capturing relevant external events in a Cloud of Things environment, Futur. Gener. Comput. Syst.
- Semi-supervised machine learning approach for DDoS detection, Appl. Intell.
- Application of machine learning algorithms to KDD intrusion detection dataset within misuse detection context
- A hybrid spectral clustering and deep neural network ensemble algorithm for intrusion detection in sensor networks, Sensors
- Deep radial intelligence with cumulative incarnation approach for detecting denial of service attacks, Neurocomputing
Jinsong Wang received the B.Sc. degree from the Department of Computer Science and Engineering, Tianjin University of Technology, Tianjin, China, and the M.Sc. and Ph.D. degrees from Nankai University, Tianjin.
He is currently a Professor with the School of Computer and Communication Engineering, Tianjin University of Technology. His main research interests include computer networks, blockchain, distributed computation, and evolutionary computation.
Wei Huang received his M.Sc. degree from the School of Information Engineering, East China Institute of Technology, Jiangxi, China, in 2006, and his Ph.D. degree from the State Key Laboratory of Software Engineering, Wuhan University, Wuhan, China, in 2011.
From 2011 to 2012, he was a Research Professor in the Computational Intelligence Laboratory, Suwon University, South Korea. He is currently a Professor in the School of Computer Science and Engineering, Tianjin University of Technology, Tianjin, China. His research interests include evolutionary computation, fuzzy systems, fuzzy-neural networks, and advanced computational intelligence. He currently serves as an Associate Editor of the Journal of Electrical Engineering & Technology.