Hybrid of binary gravitational search algorithm and mutual information for feature selection in intrusion detection systems

  • Methodologies and Application
Soft Computing

Abstract

Intrusion detection systems (IDSs) play an important role in the security of computer networks. One of the main challenges in IDSs is the analysis of high-dimensional input data, and feature selection is one way to overcome this problem. This paper presents a hybrid feature selection method that combines the binary gravitational search algorithm (BGSA) with mutual information (MI) to improve the efficiency of standard BGSA as a feature selection algorithm. The proposed method, called MI-BGSA, uses BGSA as a wrapper-based feature selection method to perform a global search. The MI approach is integrated into BGSA as a filter-based method that computes the feature–feature and feature–class mutual information in order to prune the subset of features. This strategy selects the features that are least redundant with respect to the already selected features and most relevant to the target class. A two-objective fitness function, based on maximizing the detection rate and minimizing the false positive rate, controls the search direction of BGSA. Experimental results on the NSL-KDD dataset show that the proposed method can reduce the feature space dramatically. Moreover, the proposed algorithm finds a better subset of features and achieves higher accuracy and detection rate than some standard wrapper-based and filter-based feature selection methods.
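
The two ingredients named in the abstract, MI-based pruning and a two-objective fitness, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the exact formulas, so the scoring rule (relevance minus mean redundancy, in the style of max-relevance/min-redundancy criteria), the equal weighting `w`, and all function names are assumptions made for illustration only. Features are assumed to be already discretized.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def relevance_minus_redundancy(candidate, selected, labels):
    """Hypothetical pruning score: feature-class MI minus mean feature-feature MI.

    ASSUMPTION: the paper's actual criterion may differ from this form.
    """
    relevance = mutual_information(candidate, labels)
    if not selected:
        return relevance
    redundancy = sum(mutual_information(candidate, s) for s in selected) / len(selected)
    return relevance - redundancy


def fitness(tp, fn, fp, tn, w=0.5):
    """Hypothetical scalarized two-objective fitness (higher is better).

    Rewards a high detection rate and a low false positive rate.
    ASSUMPTION: the weight w and the linear combination are illustrative.
    """
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return w * detection_rate + (1 - w) * (1 - false_positive_rate)
```

In a wrapper such as BGSA, each candidate binary mask would be scored by training a classifier on the masked features and passing its confusion-matrix counts to a function like `fitness`, while a score like `relevance_minus_redundancy` would decide which features to drop from the mask.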

Author information

Corresponding author

Correspondence to Mansour Sheikhan.

Ethics declarations

Conflict of interest

The authors declare that there is no potential conflict of interest in this work.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Bostani, H., Sheikhan, M. Hybrid of binary gravitational search algorithm and mutual information for feature selection in intrusion detection systems. Soft Comput 21, 2307–2324 (2017). https://doi.org/10.1007/s00500-015-1942-8

  • DOI: https://doi.org/10.1007/s00500-015-1942-8
