Intrusion detection system for wireless mesh network using multiple support vector machine classifiers with genetic-algorithm-based feature selection
Introduction
The wireless mesh network (WMN) has become a notable communication technology in recent years. It adopts multi-hop forwarding to provide high-speed data communication with minimal data loss, and it suits various cyber–physical applications such as smart grids, military systems and the internet of things (Kim et al., 2012). The multi-hop nature of a WMN makes it vulnerable to attacks such as flooding, blackhole and greyhole. Flooding is a denial-of-service attack in which a malicious node frequently transmits bulk amounts of HELLO packets, data and undesirable messages. It causes congestion at the receiver buffers and communication channels of the WMN, which increases the packet dropping ratio and blocks communication. Blackhole is a major man-in-the-middle (MITM) attack on WMNs that interrupts communication by compromising an active node, which then drops the packets of all nodes in the network. The greyhole attack also belongs to the MITM category; it maliciously drops the packets of selected nodes only. An intrusion detection system (IDS) is an important security mechanism that protects the network from these attacks by analysing the network traffic. Numerous methods have been adopted for implementing IDSs in WMNs (Wang and Yi, 2011; Nguyen et al., 2014).
Various machine learning algorithms, such as the support vector machine (SVM) (Cortes and Vapnik, 1995) and the artificial neural network (ANN) (McCulloch and Pitts, 1943), are used as classifiers in IDSs for WMNs. They detect attacks by analysing the parameters of the network traffic data. The traffic data is noisy: some features increase the accuracy of intrusion detection when used as input, whereas others decrease it. Hence, selecting informative features as the input to the learning algorithms is essential for an IDS. Feature selection is challenging, and the problem is NP-hard (Montazeri et al., 2013).
Feature selection methods are categorized as either filter or wrapper methods. In a filter method, the non-informative features are removed from the input set by determining the relations between the input variables and the corresponding output (Zhang et al., 2014). In a wrapper method, by contrast, the informative features are selected by iteratively evaluating the fitness of candidate features with a learning algorithm such as a Bayesian classifier or an SVM (Jie and Po, 2011; Huang and Wang, 2006). Feature selection using evolutionary computation (EC) techniques such as the genetic algorithm (GA), differential evolution (DE) and particle swarm optimization (PSO) belongs to the category of wrapper methods. Recently, filter and wrapper methods have been combined to select optimal subsets of features; the combination is called a hybrid feature-selection technique. In Huang et al. (2007), a hybrid technique based on mutual information (MI) and GA is used, in which MI selects the semi-informative features by eliminating the non-informative ones and GA then selects the informative features from those semi-informative features. The major drawback of this method is the use of a filter method as the initial step, which is likely to eliminate a few informative features. In certain cases, multiple EC techniques are combined to obtain good classification results, e.g., GA and PSO (Ghamisi and Benediktsson, 2015), or ant colony optimization (ACO) and bee colony optimization (BCO) (Shunmugapriya and Kanmani, 2017).
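As a rough illustration of the filter idea described above, the following sketch scores each feature by its mutual information with the class label and keeps the top-ranked ones. The feature names and the toy data are hypothetical and are not taken from the paper's dataset; the estimator is the plain plug-in estimate for discrete values, not necessarily the one used in the cited works.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of the mutual information (in bits) of two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        p_joint = count / n
        mi += p_joint * math.log2(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi

def filter_select(columns, labels, k):
    """Rank features by MI with the label and return the k best names plus all scores."""
    scores = {name: mutual_information(col, labels) for name, col in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores

# Hypothetical binary traffic features; label 1 marks an attack record.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
columns = {
    "pkt_rate_high": [0, 0, 1, 1, 0, 1, 0, 1],  # mirrors the label exactly
    "hop_count_odd": [0, 1, 0, 1, 0, 1, 0, 1],  # only weakly related to the label
    "constant":      [1, 1, 1, 1, 1, 1, 1, 1],  # carries no information
}
selected, scores = filter_select(columns, labels, k=1)
```

Because a filter scores each feature independently of the classifier, it is cheap, but it can discard features that are informative only in combination; this is the weakness the text attributes to using a filter as the first step of a hybrid method.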
The genetic algorithm is a population-based search algorithm that evolves a population of individuals using three genetic operators, namely selection, crossover and mutation, to obtain the optimal solution (Goldberg, 1989). It has undergone numerous enhancements since its early versions to obtain optimized results in different applications. For example, the population is represented in real-coded format in Devaraj (2007), and the crossover operation is enhanced in Durairaj et al. (2006). In addition, several enhancements have been made to GA for selecting informative features from datasets. Machine learning algorithms such as fuzzy systems, learning automata, SVM and the multilayer perceptron (MLP) have been used as classifiers in such applications. SVM outperforms many of the available machine learning algorithms and is widely used as a classifier in complex applications.
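The three operators can be sketched on binary chromosomes as follows. This is a minimal illustration under assumptions that the text does not specify: tournament selection, one-point crossover, bit-flip mutation, elitism of one individual, and arbitrary parameter values. The fitness used in the example (the number of ones in the chromosome) is only a placeholder.

```python
import random

def tournament_select(population, fitness, rng, k=2):
    """Selection: pick k random individuals and return the fittest of them."""
    return max(rng.sample(population, k), key=fitness)

def one_point_crossover(a, b, rng):
    """Crossover: swap the tails of two parents at a random cut point."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]

def bit_flip_mutation(chromosome, rng, rate):
    """Mutation: flip each gene independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in chromosome]

def run_ga(fitness, n_bits, pop_size=20, generations=40, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the current best over unchanged
        while len(children) < pop_size:
            p1 = tournament_select(population, fitness, rng)
            p2 = tournament_select(population, fitness, rng)
            c1, c2 = one_point_crossover(p1, p2, rng)
            children.append(bit_flip_mutation(c1, rng, mutation_rate))
            if len(children) < pop_size:
                children.append(bit_flip_mutation(c2, rng, mutation_rate))
        population = children
        best = max(population, key=fitness)
    return best

# Placeholder fitness: count of ones in a 16-bit chromosome.
best = run_ga(fitness=sum, n_bits=16)
```

With elitism, the best fitness in the population never decreases from one generation to the next, although the search is still a heuristic and is not guaranteed to reach the global optimum.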
Most of the available feature selection techniques select the common informative features (CIF) of all the classes from the sample space. In previous works (Wang, 2008; Khushaba et al., 2011), a single feature subset is selected as the common informative features for all types of attacks. One drawback of using CIF is that the classifier exhibits a high false positive rate. Another is that it yields a sub-optimal subset of informative features. Hence, it is necessary to identify the informative features of each class to improve the performance of the classifier.
Local feature selection for improving classifier performance can be implemented with either an instance-based or a model-based learning method. In an instance-based method, the weights of each feature are adjusted to achieve the maximal margin, whereas in a model-based approach, an approximate model is designed for learning (Liu et al., 2011). Instance-based and model-based methods have their own drawbacks, namely high computational intensity and coarse function approximation, respectively (Driessens and Dzeroski, 2005). In certain cases (Liu et al., 2012), the two learning methods are combined to gain the advantages of both.
In this paper, a model-based local feature selection for each category of attacks is proposed for IDS development. The performance of the proposed system is evaluated using an intrusion dataset generated from a WMN simulated in the Network Simulator 3 (NS3) tool and using a standard intrusion dataset. The experimental results demonstrate that the proposed system with the local feature selection technique is substantially more efficient than a conventional system with common feature selection techniques.
Section snippets
Proposed methodology for development of IDS
The proposed IDS combines GA-based feature selection with SVM classifiers. The SVM classifier exhibits a high attack detection ratio and is suitable for detecting multiple attacks in a WMN (Zhang et al., 2011). A separate classifier is assigned to each attack category and is trained with the informative features of that attack's data, as selected by the proposed feature selection technique. The classifiers are arranged in linear order as shown in Fig. 1, and each classifier is placed in the order of
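The linear arrangement of one classifier per attack category can be sketched as a simple cascade. Everything concrete below is an assumption made for illustration: the feature layout, the thresholds, the stage order (the ordering criterion is not recoverable from the text above), and the use of plain threshold rules as stand-ins for trained binary SVMs.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class Stage:
    attack: str                                   # label emitted when this stage fires
    feature_idx: Sequence[int]                    # local informative features for this attack
    is_attack: Callable[[Sequence[float]], bool]  # stand-in for a trained binary SVM

def classify(sample, stages):
    """Pass the sample through the stages in order; the first stage that fires wins."""
    for stage in stages:
        local = [sample[i] for i in stage.feature_idx]
        if stage.is_attack(local):
            return stage.attack
    return "normal"

# Hypothetical feature layout: [hello_rate, drop_ratio, selective_drop_ratio]
stages = [
    Stage("flooding", [0], lambda f: f[0] > 100.0),
    Stage("blackhole", [1], lambda f: f[0] > 0.9),
    Stage("greyhole", [1, 2], lambda f: f[0] <= 0.9 and f[1] > 0.5),
]

print(classify([250.0, 0.10, 0.0], stages))  # flooding
print(classify([10.0, 0.05, 0.0], stages))   # normal
```

Each stage sees only its own feature subset, which is the point of local feature selection: a feature that helps detect one attack does not have to be fed to the classifiers for the others.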
GA-based feature selection technique
The feature selection problem is a highly complex NP-hard problem. It can be stated as follows: a subset of m informative features is selected from the n available features, where m < n. The selected features are expected to reduce the computational effort while yielding high accuracy. In this section, optimal feature selection using the genetic algorithm is described in detail.
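One common way to pose this problem for a GA is to encode a candidate subset as a binary mask of length n and to score it with a wrapper fitness. The sketch below illustrates only the encoding and the fitness evaluation, under stated assumptions: a nearest-centroid classifier stands in for the SVM, accuracy is measured on the training data, and the small per-feature penalty is a hypothetical way to express the preference for fewer features. With n = 3, all masks are enumerated instead of running a GA search.

```python
from itertools import product

def apply_mask(row, mask):
    """Keep only the features whose mask bit is 1."""
    return [v for v, keep in zip(row, mask) if keep]

def centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def nearest_centroid_accuracy(X, y, mask):
    """Training accuracy of a nearest-centroid classifier on the masked features."""
    if not any(mask):
        return 0.0  # an empty subset cannot classify anything
    Xm = [apply_mask(r, mask) for r in X]
    cents = {c: centroid([r for r, lab in zip(Xm, y) if lab == c]) for c in set(y)}

    def predict(r):
        return min(cents, key=lambda c: sum((a - b) ** 2 for a, b in zip(r, cents[c])))

    return sum(predict(r) == lab for r, lab in zip(Xm, y)) / len(y)

def fitness(X, y, mask, penalty=0.01):
    """Wrapper fitness: accuracy minus an assumed penalty per selected feature."""
    return nearest_centroid_accuracy(X, y, mask) - penalty * sum(mask)

# Toy data: only the first feature separates the two classes cleanly.
X = [[0.10, 5.0, 0.9], [0.20, 1.0, 0.1], [0.15, 4.0, 0.5],
     [0.90, 5.5, 0.8], [0.80, 0.5, 0.2], [0.85, 4.5, 0.4]]
y = [0, 0, 0, 1, 1, 1]

best_mask = max(product([0, 1], repeat=3), key=lambda m: fitness(X, y, m))
print(best_mask)  # (1, 0, 0)
```

Exhaustive enumeration grows as 2^n and is feasible only for very small n, which is why a heuristic search such as GA is used for realistic feature counts.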
The genetic algorithm (Holland, 1975) is an adaptive heuristic global search algorithm inspired by the evolutionary
SVM classifier
The SVM classifier, a supervised learning algorithm, is designed around the fundamental concept of classifying data with a hyperplane or line. The architecture of an SVM classifier is shown in Fig. 3. The hyperplanes are linear in nature and are mathematically expressed as $w^{T}x + b = 0$, where w is the weight vector, x is the input data and b is the bias value. Several hyperplanes are available between the classes, and it is necessary to select the most effective classification hyperplane to obtain
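For a linear hyperplane with weight vector w and bias b, a sample x is assigned to a class by the sign of w·x + b, and its geometric distance to the hyperplane is |w·x + b| / ||w||. The sketch below shows only this evaluation step; the values of w and b are hypothetical, and training (finding the maximum-margin w and b) is not shown.

```python
import math

def decision_value(w, x, b):
    """Signed score w.x + b; zero means x lies on the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, x, b):
    """Class label +1 or -1 according to the side of the hyperplane."""
    return 1 if decision_value(w, x, b) >= 0 else -1

def distance_to_hyperplane(w, x, b):
    """Geometric distance |w.x + b| / ||w|| from x to the hyperplane."""
    return abs(decision_value(w, x, b)) / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical hyperplane 3*x1 + 4*x2 - 5 = 0
w, b = [3.0, 4.0], -5.0
print(predict(w, [3.0, 4.0], b), distance_to_hyperplane(w, [3.0, 4.0], b))  # 1 4.0
print(predict(w, [0.0, 0.0], b), distance_to_hyperplane(w, [0.0, 0.0], b))  # -1 1.0
```

Among the hyperplanes that separate the classes, the SVM prefers the one that maximizes the smallest such distance over the training samples.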
Results and discussion
To demonstrate the efficiency of the proposed feature selection technique, an intrusion dataset has been generated from a WMN environment simulated in NS3. The WMN simulation diagram is shown in Fig. 4. The simulated network has 30 nodes, of which one acts as the base station; all the other nodes transmit data to that base station in a mesh fashion. In this simulation, the AODV routing protocol is used to establish WMN communication in NS3 (Kim et al., 2010).
After generating the dataset, the
Conclusion
In this paper, the local informative features of each category of attacks are selected using GA and SVM classifiers to develop an IDS for a WMN. For particular attacks, using the informative features of each attack category yields a higher detection accuracy than using the common informative features. The performance of the proposed feature selection algorithm is analysed by comparing it with MI-based feature selection techniques using generated intrusion datasets and standard
R. Vijayanand is a Research Scholar in the Department of Computer Science and Engineering, Kalasalingam University, Krishnankoil, India. He received his B.E. in Computer Science and Engineering from PTR College of Engineering and Technology, Madurai, and his M.Tech. in Computer Science and Engineering from Kalasalingam University. He is currently pursuing his doctoral degree in the field of security in smart meter data transmission.
References (32)
- et al. Anomaly intrusion detection based on PLS feature extraction and core vector machine. Knowl Based Syst (2013)
- et al. A GA based feature selection and parameter optimization for support vector machines. Expert Syst (2006)
- et al. A hybrid genetic algorithm for feature selection wrapper based on mutual information. Pattern Recognit Lett (2007)
- et al. Naive Bayesian classifier based on genetic simulated annealing algorithm. Procedia Eng (2011)
- et al. A hybrid algorithm using ant and bee colony optimization for feature selection and classification. Swarm Evol Comput (2017)
- et al. Image reconstruction using multi layer perceptron and support vector machine classifier and study of classification accuracy. Int J Sci Technol Res Vol (2015)
- et al. Support-vector networks. Mach Learn (1995)
- et al. Generation of a new IDS test dataset: time to retire the KDD collection
- et al. Radial basis function networks for fast contingency ranking. Electr Power Energy Syst (2002)
- Improved genetic algorithm for multi-objective reactive power dispatch problem. Int Trans Electr Energy Syst (2007)
- Combining model-based and instance-based learning for first order regression
- Voltage stability constrained reactive power planning using improved genetic algorithm. Int J Water Energy
- A computationally efficient estimator for mutual information. Proc R Soc
- Feature selection based on hybridization of genetic algorithm and particle swarm optimization. IEEE Trans. Geosci. Remote.
- Genetic algorithms in search, optimization, and machine learning
- A comparative performance evaluation of intrusion detection techniques for hierarchical wireless sensor networks. Egypt Inform J
Dr. D. Devaraj is a Senior Professor in the Department of Electrical and Electronics Engineering, Kalasalingam University, Tamilnadu, India. He completed his B.E. in Electrical & Electronics Engineering and his M.E. in Power System Engineering in 1992 and 1994, respectively, at Thiagarajar College of Engineering, Madurai. He obtained his Ph.D. degree from IIT Madras in 2001. He has guided more than 18 Ph.D. scholars. His research interests include power system security, voltage stability, smart grids and evolutionary algorithms.
Dr. B. Kannapiran is an Associate Professor in the Department of Instrumentation and Control Engineering, Kalasalingam University, Tamilnadu, India. He received his Ph.D. degree in Information and Communication Engineering from Anna University, Chennai, in 2013. He received his M.E. degree in Applied Electronics from Madurai Kamaraj University in 2002 and his B.E. degree in Instrumentation and Control Engineering from Madurai Kamaraj University in 2001. His research interests include soft computing, fault diagnosis, biomedical instrumentation and wireless networks.