Abstract
The accuracy and efficiency of network intrusion detection systems based on machine learning techniques depend largely on the selected features. However, choosing the optimal subset from a set of commonly used features requires extensive computing resources: the number of non-empty feature subsets of n given features is \(2^{n}-1\). To tackle this problem, we propose an optimal feature selection algorithm in this paper. The proposed algorithm is based on local search, one of the representative meta-heuristic algorithms for solving computationally hard optimization problems. In particular, the clustering accuracy obtained by applying the k-means clustering algorithm to the training data set serves as the cost function that measures the goodness of a feature subset. To evaluate the performance of the proposed algorithm, we compare it with the feature set composed of all 41 features on the NSL-KDD data set using a multi-layer perceptron.
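The idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the neighborhood, the starting subset, or how clustering accuracy is computed, so the one-feature flip neighborhood, the first-improvement rule, the majority-label accuracy, and the toy data (all names such as `make_toy_data` and `local_search` are hypothetical) are assumptions made only for illustration.

```python
import random
from collections import Counter

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        assign = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        # Update step: move each non-empty cluster's center to its mean.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def clustering_accuracy(assign, labels, k):
    """Assumed measure: label each cluster by its majority class."""
    correct = 0
    for c in range(k):
        members = [y for a, y in zip(assign, labels) if a == c]
        if members:
            correct += Counter(members).most_common(1)[0][1]
    return correct / len(labels)

def cost(subset, X, y, k):
    """Goodness of a feature subset: k-means clustering accuracy on it."""
    cols = sorted(subset)
    projected = [[row[j] for j in cols] for row in X]
    return clustering_accuracy(kmeans(projected, k), y, k)

def local_search(X, y, k, seed=0):
    """First-improvement local search over non-empty feature subsets."""
    n = len(X[0])
    rng = random.Random(seed)
    current = frozenset(j for j in range(n) if rng.random() < 0.5) or frozenset([0])
    best = cost(current, X, y, k)
    improved = True
    while improved:
        improved = False
        for j in range(n):
            neighbor = current ^ {j}  # flip feature j in or out
            if not neighbor:
                continue  # the empty subset is not a valid candidate
            score = cost(neighbor, X, y, k)
            if score > best:
                current, best, improved = neighbor, score, True
                break
    return current, best

def make_toy_data(n_per_class=20, seed=1):
    """Hypothetical data: feature 0 separates the classes, 1 and 2 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([label * 10 + rng.random(), rng.uniform(0, 50), rng.uniform(0, 50)])
            y.append(label)
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data()
    subset, score = local_search(X, y, k=2)
    print(sorted(subset), score)
```

Each step evaluates at most n neighboring subsets instead of enumerating all \(2^{n}-1\) candidates, which is the motivation for using a meta-heuristic here. The search stops at a local optimum, so the subset it returns depends on the starting point.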
Acknowledgments
This research was supported by Korea Electric Power Corporation through Korea Electrical Engineering & Science Research Institute (Grant number: R15XA03-63).
Cite this article
Kang, SH., Kim, K.J. A feature selection approach to find optimal feature subsets for the network intrusion detection system. Cluster Comput 19, 325–333 (2016). https://doi.org/10.1007/s10586-015-0527-8