
A feature selection approach to find optimal feature subsets for the network intrusion detection system

Cluster Computing

Abstract

The accuracy and efficiency of network intrusion detection systems based on machine learning depend largely on the features selected. However, choosing the optimal subset from the commonly used features requires extensive computing resources, because n given features yield \(2^{n}-1\) possible non-empty subsets. To tackle this problem, we propose an optimal feature selection algorithm based on local search, a representative meta-heuristic for solving computationally hard optimization problems. In particular, the cost function that measures the goodness of a feature subset is the clustering accuracy obtained by applying the k-means clustering algorithm to the training data set. To evaluate the proposed algorithm, we compare it against the feature set composed of all 41 features on the NSL-KDD data set, using a multi-layer perceptron.
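
To make the idea concrete, the following is a minimal Python sketch of a local search over feature subsets that uses k-means clustering accuracy as the score. It is not the authors' implementation. The neighbourhood (flip one feature in or out), the first-improvement acceptance rule, the random starting subset, and the definition of clustering accuracy as majority-label purity are all assumptions made for illustration, as are the function names `kmeans_labels`, `clustering_accuracy`, and `local_search`. For scale, with n = 41 an exhaustive search would have to score 2^41 - 1 = 2,199,023,255,551 subsets.

```python
import random


def kmeans_labels(rows, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a cluster index for each row."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(r, centers[c])))
            for r in rows
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def clustering_accuracy(X, y, subset, k):
    """Score a feature subset (assumed metric: majority-label purity)."""
    rows = [[row[j] for j in subset] for row in X]
    labels = kmeans_labels(rows, k)
    correct = 0
    for c in range(k):
        classes = [cls for cls, lab in zip(y, labels) if lab == c]
        if classes:
            # Each cluster is credited with its most frequent true class.
            correct += max(classes.count(v) for v in set(classes))
    return correct / len(y)


def local_search(X, y, k, seed=0):
    """First-improvement hill climbing over subsets (assumed neighbourhood)."""
    n = len(X[0])
    rng = random.Random(seed)
    # Random non-empty starting subset.
    current = frozenset(j for j in range(n) if rng.random() < 0.5) or frozenset([0])
    best = clustering_accuracy(X, y, sorted(current), k)
    improved = True
    while improved:
        improved = False
        for j in range(n):
            neighbor = current ^ {j}  # add feature j if absent, drop it if present
            if not neighbor:
                continue  # never evaluate the empty subset
            score = clustering_accuracy(X, y, sorted(neighbor), k)
            if score > best:
                current, best, improved = neighbor, score, True
    return sorted(current), best
```

Calling `local_search(X, y, k)` on a list of numeric feature rows `X` with class labels `y` returns the selected feature indices and their score. Because the score must strictly increase on every accepted move, the loop stops at a local optimum, which is not guaranteed to be the globally best subset.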



Acknowledgments

This research was supported by Korea Electric Power Corporation through Korea Electrical Engineering & Science Research Institute (Grant number: R15XA03-63).

Author information

Corresponding authors

Correspondence to Seung-Ho Kang or Kuinam J. Kim.

About this article

Cite this article

Kang, SH., Kim, K.J. A feature selection approach to find optimal feature subsets for the network intrusion detection system. Cluster Comput 19, 325–333 (2016). https://doi.org/10.1007/s10586-015-0527-8

  • DOI: https://doi.org/10.1007/s10586-015-0527-8
