Neurocomputing

Volume 415, 20 November 2020, Pages 332-346

Hybrid fuzzy integrated convolutional neural network (HFICNN) for similarity feature recognition problem in abnormal netflow detection

https://doi.org/10.1016/j.neucom.2020.07.076

Abstract

This paper presents a hybrid fuzzy integrated convolutional neural network (HFICNN) to deal with the similarity feature recognition problem in abnormal netflow detection: records with identical netflow eigenvalues but different labels are not handled well by traditional machine learning and deep learning methods. By constructing an integration method, the HFICNN model converts individuals that share the same features but carry different labels into individuals with different features and different labels. Our method takes the actual environment and operational complexity into account, and turns the unique occurrence time of each flow into a new feature for recognition. HFICNN integrates time-dependent records into a new record to identify a feature, and then converts the integrated results back to the original labels. This integration step is central to the whole process. HFICNN is realized with the aid of function mapping label redefinition (FMLR), adaptive feature integration convolutional neurons (AFICNs), fuzzy classification (FC), and inverse function mapping integrated reduction (IFMIR). The model is tested on the ICMPv6-based DDoS attacks dataset of a new-generation network, and the experimental results show that HFICNN performs better than 10 types of traditional machine learning methods and two types of deep learning methods on the similarity feature recognition problem, indicating that the HFICNN model is reliable and effective.

Introduction

With the rapid development of networks, security issues have come to involve a series of real-world problems, among which abnormal netflow attacks are significant, difficult to handle, and related to personal safety. Many researchers have studied abnormal netflow detection. Lin Weichao et al. [1] proposed a new feature representation method based on the distance between samples and cluster centers. They found that the dimension of the feature representation affects detection performance, and that feature selection is a representative approach to abnormal detection. Tavallaee et al. [2] analyzed the KDD99 dataset [3], [4], [5], found that its large number of redundant records strongly affected the evaluation of detection models, and proposed the new NSL-KDD dataset. Sabhnani et al. [6] compared ten machine learning algorithms and proposed a multi-classification model combining the multi-layer perceptron, Gaussian classifier, and k-means algorithm. They observed that detection algorithms are associated with specific attack categories. Dong et al. [7] encoded test data using a DBN model with an unsupervised learning algorithm in each layer. They pointed out that deep learning can analyze network security in more ways than traditional machine learning. Aminanto et al. [8] used an artificial neural network (ANN) for feature selection, and confirmed that simplified input features can sufficiently meet the requirements of the classification task. Ma et al. [9] divided a dataset by spectral clustering and extracted features using an autoencoder. They argued that a more complex network structure reduces learning efficiency. Most of these studies selected strongly influencing features [10] for machine learning and deep learning [11], [12], [13], while few researchers were concerned with why attacks cannot be detected accurately.

Abnormal netflow attacks are usually hidden, destructive, and uncertain [14], [15], [16]. Abnormal netflow detection relies on feature extraction from network equipment, which requires considerable expertise. Furthermore, a realistic extraction process is complex and time-consuming. Under these circumstances, the dimension of the extracted features is usually low and the features do not differentiate records clearly. The best-known public dataset today is KDD99, which contains 41-dimensional features with a large number of zero values. When the detected dimension is low, normal and abnormal netflow resemble each other to the point that their extracted features are identical, which makes it difficult to detect abnormal netflow precisely. Fig. 1 displays a filtered part of the ICMPv6-based DDoS attacks dataset. It contains a large number of records with the same eigenvalues but different netflow labels, as well as many zero values throughout.

Given these facts, traditional machine learning and deep learning cannot recognize such records accurately because, regardless of which learning model is constructed, all of its parameters are fixed once training is complete, so identical inputs always receive the same output. Even if the model performs well in testing, new records that share the same features but carry different labels cannot all be classified correctly. In other words, the achievable detection accuracy is limited mainly by the proportion of such records in the data. Hence, we must find a new way to transform differently labeled individuals with the same eigenvalues into individuals with different eigenvalues.
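The limit described above can be made concrete with a small calculation. The following is a minimal sketch, not taken from the paper: the helper name `accuracy_ceiling` and the toy records are hypothetical. It assumes a deterministic classifier, which must give one answer per distinct feature vector, so at best it matches the majority label of each group of identical records.

```python
from collections import Counter, defaultdict


def accuracy_ceiling(features, labels):
    """Upper bound on accuracy for any deterministic classifier on this data.

    Hypothetical helper for illustration: identical feature vectors must
    receive the same prediction, so only the majority label in each group
    of identical records can be classified correctly.
    """
    groups = defaultdict(Counter)
    for x, y in zip(features, labels):
        groups[tuple(x)][y] += 1
    # Count the records that carry the majority label of their group.
    best_hits = sum(counts.most_common(1)[0][1] for counts in groups.values())
    return best_hits / len(labels)


# Toy data (made up): three records share the features (0, 1) but disagree on the label.
X = [(0, 1), (0, 1), (0, 1), (2, 3)]
y = ["normal", "attack", "attack", "normal"]
print(accuracy_ceiling(X, y))  # 0.75
```

In this toy case one of the three identical records is always misclassified, whatever model is trained, which is why the features themselves have to change before accuracy can improve.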

Flow feature acquisition is a feature extraction process carried out over a certain period of time. Fig. 2 shows part of the netflow dataset displayed in Wireshark, strictly ordered by occurrence time. Adding a unique time feature could therefore enable accurate feature recognition. However, the flow occurrence time cannot be added directly to the training data as a new recognition feature. Instead, we integrate the features of the flows that occur next in time, which forms a new integrated feature for identifying different classes. Time-dependent individuals can thus be identified as a new combined netflow to deal with the similarity feature recognition problem.

Records with identical features cannot be told apart on their own, just as identical twins cannot be distinguished by appearance alone. When the netflow characteristics at one moment are identical to those at another, the next one or more adjacent netflow records can be integrated until the new integrated individuals can be distinguished from each other. The number of new integration records is called the integration parameter (q). Fig. 3 shows the integration process with different integration parameters. Fig. 4 shows the main integration training process in our method. The original individuals with the same features are converted to new integrated individuals with different features. Feature learning is then carried out to identify the whole combination. Because the features are integrated, the corresponding labels must also be function mapped to a new label. After training, the combined results are inverse function mapped back to the original individuals, which solves the similarity feature recognition problem.
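The integration step can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: it assumes that q counts the following records appended to each record, that windows slide by one record, and that the integrated label is simply the tuple of the original labels. The function name `integrate` and the toy data are hypothetical.

```python
def integrate(records, labels, q):
    """Merge each time-ordered record with its next q records.

    Assumption for illustration: the integrated feature vector is the
    concatenation of q + 1 consecutive records, and the integrated label
    is the tuple of their labels. The last q records start no window.
    """
    if q < 0:
        raise ValueError("q must be non-negative")
    if len(records) != len(labels):
        raise ValueError("records and labels must have the same length")
    merged, merged_labels = [], []
    for i in range(len(records) - q):
        window = records[i:i + q + 1]
        merged.append(tuple(value for record in window for value in record))
        merged_labels.append(tuple(labels[i:i + q + 1]))
    return merged, merged_labels


# Toy data (made up): the first two records are identical but labeled differently.
records = [(0, 0), (0, 0), (5, 1), (0, 0)]
labels = ["normal", "attack", "attack", "normal"]
merged, merged_labels = integrate(records, labels, q=1)
print(merged)         # [(0, 0, 0, 0), (0, 0, 5, 1), (5, 1, 0, 0)]
print(merged_labels)  # [('normal', 'attack'), ('attack', 'attack'), ('attack', 'normal')]
```

With q = 1 the two identical records now start windows with different contents, so a classifier can separate them. If two windows were still identical, a larger q would be tried.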

Hybrid architectures are developed using constructs of computational intelligence such as fuzzy sets, convolutional neural networks (CNNs), and deep neural networks (DNNs), and they show good performance in some fields [17], [18], [19], [20]. A combination of adaptive feature integration methods, CNNs, and DNNs is therefore a viable way to deal with the problem. We adopt the adaptive feature integration method to integrate the input data of different individuals, and the function mapping label redefinition method to integrate their output labels. Because of this integrated processing, the subsequent output layers must produce their output in stages. The first output layer is connected to the DNN output layer, so the value of each output result must be converted to the category it belongs to, which is determined by fuzzy classification. Moreover, a large number of fuzzy neural networks composed of fuzzy sets and neural networks have been proposed in the literature [21], [22], [23]. By combining fuzzy sets and neural networks, the fuzzy neural network becomes an effective tool for neural network classification problems [24], [25], [26], so we choose fuzzy classification as the first-stage output method in the hybrid fuzzy integrated convolutional neural network (HFICNN). In the second output layer, the first-stage integrated output must be transformed to the original classification results. Since the labels were integrated by the FMLR process, we select IFMIR as the inverse function to restore the combined flow label, converting it to the original classification label. To deal with the similarity feature recognition problem in abnormal netflow detection, we thus propose AFICNs and FMLR and develop an HFICNN through the FC and IFMIR methods. For example, the integration process is shown in Fig. 5 for the case that q = 1.
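The pairing of FMLR with its inverse IFMIR can be illustrated with any invertible label mapping. The mapping actually used by the authors is not given in this excerpt, so the sketch below substitutes one simple choice as an assumption: the labels of an integrated record are read as digits of a base-K number, where K is the number of original classes. The function names `fmlr` and `ifmir` merely borrow the acronyms and are hypothetical.

```python
def fmlr(label_tuple, num_classes):
    """Map a tuple of original class indices to one integrated class index.

    Assumed mapping for illustration: positional base-num_classes encoding.
    """
    code = 0
    for label in label_tuple:
        if not 0 <= label < num_classes:
            raise ValueError("label out of range")
        code = code * num_classes + label
    return code


def ifmir(code, num_classes, length):
    """Invert fmlr: recover the tuple of original class indices."""
    labels = []
    for _ in range(length):
        code, label = divmod(code, num_classes)
        labels.append(label)
    return tuple(reversed(labels))


# Two original classes (0 = normal, 1 = attack) and q = 1, so two labels per record.
print(fmlr((1, 0), num_classes=2))         # 2
print(ifmir(2, num_classes=2, length=2))   # (1, 0)
```

Because the mapping is a bijection, every prediction made on integrated labels can be reduced to exactly one sequence of original labels, which is the property the second output stage relies on.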

The contributions of this paper can be summarized as follows.

First, unlike the common perspective of asking which features most strongly influence abnormal netflow detection, we study why abnormal netflow cannot be detected accurately. We analyze the realistic environment and the actual problems of abnormal flow detection, and identify the similarity feature recognition problem: records with the same netflow eigenvalues but different labels, which traditional machine learning and deep learning methods cannot solve well.

Second, we present HFICNN to deal with the problem. The model can convert the same feature individual with different labels to different feature individuals with different labels by constructing an integration method. Our method addresses the actual environment and operational complexity, adding a unique occurrence time feature to feature recognition. HFICNN integrates time-dependent records into new records to identify features.

Third, we combine the integration method with CNN, and propose adaptive feature integration convolutional neurons (AFICNs), which can be regarded as a generalized type of CNN. Different from traditional convolution neurons, AFICNs have dimensional characteristics of different individuals instead of the same individual. We propose function mapping label redefinition (FMLR), which integrates labels of different individuals.

Fourth, after AFICNN feature extraction, we make full use of DNN for deep feature recognition. This requires a transition from the AFICNN layers to the DNN layers, so we feed the AFICNN layer outputs into the DNN layers as inputs. Each layer of the model is highly flexible, and the integration parameters can be adjusted adaptively. The numbers of layers and neurons are tunable parameters for achieving efficient final classification results.

Last, the integrated label should be inverse mapped to the original label to identify the original individuals. Unlike the traditional single-layer output neural network, our HFICNN outputs results in stages. In the first output stage, fuzzy classification (FC) and deep learning methods are combined to form AFICNN. In the second output stage, inverse function mapping integrated reduction (IFMIR) is applied to restore the original classification results, and then a complete fuzzy integrated convolution neural network is formed.
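The first output stage, which assigns a continuous network output to a category by fuzzy classification, might look like the following minimal sketch. The membership functions used in the paper are not specified in this excerpt, so triangular memberships centered on integer class codes are assumed here, and the names `triangular_membership` and `fuzzy_classify` are hypothetical.

```python
def triangular_membership(value, center, width=1.0):
    """Degree in [0, 1] to which value belongs to the class at center."""
    return max(0.0, 1.0 - abs(value - center) / width)


def fuzzy_classify(value, num_classes):
    """Pick the integrated class with the highest membership degree.

    Assumption for illustration: classes are coded 0 .. num_classes - 1 and
    the network emits one real number. Ties (including the case where all
    memberships are zero) are broken in favor of the nearest class center.
    """
    memberships = [triangular_membership(value, c) for c in range(num_classes)]
    best = max(range(num_classes),
               key=lambda c: (memberships[c], -abs(value - c)))
    return best, memberships


# A raw output of 2.3 belongs mostly to integrated class 2 and partly to class 3.
best, memberships = fuzzy_classify(2.3, num_classes=4)
print(best)  # 2
```

The chosen integrated class is what the second stage would then reduce to the original labels through the inverse mapping.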

The paper is organized as follows. Section 1 introduces the similarity feature recognition problem and the motivation behind the HFICNN method. Section 2 introduces related work and the design of HFICNN. Section 3 describes the HFICNN architecture and algorithm. Section 4 explains HFICNN procedures. Section 5 introduces the dataset and discusses the experimental results. Section 6 presents our conclusions.

Related work

We introduce the basic principles of some key methods of CNN, DNN, and FC, and propose our HFICNN model.

Architecture of HFICNN

We elaborate on the design and architecture of HFICNN. Fig. 7(a) describes a general architecture of HFICNN. It is essentially a new architecture that can be regarded as an improved CNN, as shown in Fig. 7(e). The labels are integrated into the new integrated labels by function mapping label redefinition (FMLR). The HFICNN model has three parts. First, it splits the original data into feature input data and label output data. It integrates multiple netflow records in time-related order, and

Overall design procedure of HFICNN model

We describe the HFICNN procedure in detail. The brief design procedure is shown in Fig. 9. The overall design procedure is shown in Fig. 10. The procedure has four steps. First is data integration, which splits the original data into feature input and label output data. It integrates multiple netflow records according to the time-dependent order and chronologically merges the self-arranged netflow into new records. HFICNN reintegrates the corresponding integrated labels into new labels

Experiments

We elaborate on the experimental dataset and results.

Conclusions

This paper presents a hybrid fuzzy integrated convolutional neural network (HFICNN) to deal with the similarity feature recognition problem in abnormal netflow detection: records with identical netflow eigenvalues but different labels are not handled well by traditional machine learning and deep learning methods. By constructing an integration method, the HFICNN model converts individuals that share the same features but carry different labels into individuals with different features and different labels. Our

CRediT authorship contribution statement

Xin Yue: Conceptualization, Methodology, Investigation, Software, Data curation, Writing - original draft. Jinsong Wang: Conceptualization, Resources, Supervision, Project administration, Funding acquisition. Wei Huang: Conceptualization, Methodology, Investigation, Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

The work was supported by the National Key Research and Development Plan of China (No. 2018YFC0831405), the Natural Science Foundation of Tianjin (No. 18JCZDJC30700), the New Generation Artificial Intelligence Technology Major Project of Tianjin (No. 19ZXZNGX000801100), the National Natural Science Foundation of China (No. 61673295), the Natural Science Foundation of Tianjin for Distinguished Young Scholars (No. 19JCJQJC61500), and the Natural Science Foundation of Tianjin (No. 18JCYBJC85200).

Xin Yue received the B.Sc. and M.Sc. degrees from the Department of Computer Science and Engineering, Tianjin University of Technology, Tianjin, China. She is now a Ph.D. candidate at Tianjin University of Technology.

Her main research interests include computer networks, artificial intelligence, and evolutionary computation.

References (61)

  • H. Chen et al., A deep learning framework for identifying children with ADHD using an EEG-based brain network, Neurocomputing (2019)
  • A. Ben Mabrouk et al., Abnormal behavior recognition for intelligent video surveillance systems: a review, Expert Syst. Appl. (2018)
  • Y.H. Jing et al., Neural-network-based adaptive fault-tolerant tracking control of uncertain nonlinear time-delay systems under output constraints and infinite number of actuator faults, Neurocomputing (2018)
  • S.J. Horng et al., A novel intrusion detection system based on hierarchical clustering and support vector machines, Expert Syst. Appl. (2011)
  • W. Lv et al., Finite-time adaptive fuzzy output-feedback control of MIMO nonlinear systems with hysteresis, Neurocomputing (2018)
  • M.S. Pavel et al., Object class segmentation of RGB-D video using recurrent convolutional neural networks, Neural Networks (2017)
  • Y. Li et al., Moving object detection via segmentation and saliency constrained RPCA, Neurocomputing (2019)
  • S. Qian et al., Adaptive activation functions in convolutional neural networks, Neurocomputing (2018)
  • T. Ma et al., Graph classification based on graph set reconstruction and graph kernel feature reduction, Neurocomputing (2018)
  • H.M. Fayek et al., Evaluating deep learning architectures for speech emotion recognition, Neural Networks (2017)
  • M. Raza et al., Appearance based pedestrians head pose and body orientation estimation using deep learning, Neurocomputing (2018)
  • S. Zhang et al., A hybrid approach combining an extended BBO algorithm with an intuitionistic fuzzy entropy weight method for QoS-aware manufacturing service supply chain optimization, Neurocomputing (2018)
  • F. Nawaz et al., Proactive management of SLA violations by capturing relevant external events in a Cloud of Things environment, Futur. Gener. Comput. Syst. (2019)
  • M. Tavallaee, E. Bagheri, W. Lu, A.A. Ghorbani, A detailed analysis of the KDD CUP 99 data set, in: IEEE Symposium on...
  • M. Idhammad et al., Semi-supervised machine learning approach for DDoS detection, Appl. Intell. (2018)
  • M. Sabhnani et al., Application of machine learning algorithms to KDD intrusion detection dataset within misuse detection context
  • B. Dong, X. Wang, Comparison deep learning method to traditional methods using for network intrusion detection, in:...
  • M.E. Aminanto, K. Kim, Deep learning-based feature selection for intrusion detection system in transport layer,...
  • T. Ma et al., A hybrid spectral clustering and deep neural network ensemble algorithm for intrusion detection in sensor networks, Sensors (2016)
  • B.A. Bhuvaneswari et al., Deep radial intelligence with cumulative incarnation approach for detecting denial of service attacks, Neurocomputing (2019)

    Jinsong Wang received the B.Sc. degree from the Department of Computer Science and Engineering, Tianjin University of Technology, Tianjin, China, and the M.Sc. and Ph.D. degrees from Nankai University, Tianjin.

    He is currently a Professor with the School of Computer and Communication Engineering, Tianjin University of Technology. His main research interests include computer networks, blockchain, distributed computation, and evolutionary computation.

    Wei Huang received the M.Sc. degree from the School of Information Engineering, East China Institute of Technology, Jiangxi, China, in 2006, and the Ph.D. degree from the State Key Laboratory of Software Engineering, Wuhan University, Wuhan, China, in 2011.

    From 2011 to 2012, he was a Research Professor in the Computational Intelligence Laboratory, Suwon University, South Korea. He is currently a Professor in the School of Computer Science and Engineering, Tianjin University of Technology, Tianjin, China. His research interests include evolutionary computation, fuzzy systems, fuzzy-neural networks, and advanced computational intelligence. He currently serves as an Associate Editor of the Journal of Electrical Engineering & Technology.