Abstract
The goal of securing a network is to protect the information flowing through the network and to ensure the security of intellectual as well as sensitive data for the underlying application. To accomplish this goal, security mechanism such as Intrusion Detection System (IDS) is used, that analyzes the network traffic and extract useful information for inspection. It identifies various patterns and signatures from the data and use them as features for attack detection and classification. Various Machine Learning (ML) techniques are used to design IDS for attack detection and classification. All the features captured from the network packets do not contribute in detecting or classifying attack. Therefore, the objective of our research work is to study the effect of various feature selection techniques on the performance of IDS. Feature selection techniques select relevant features and group them into subsets. This paper implements Chi-Square, Information Gain (IG), and Recursive Feature Elimination (RFE) feature selection techniques with ML classifiers namely Support Vector Machine, Naïve Bayes, Decision Tree Classifier, Random Forest Classifier, k-nearest neighbours, Logistic Regression, and Artificial Neural Networks. The methods are experimented on NSL-KDD dataset and comparative analysis of results is presented.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Agarwal N, Hussain SZ (2018) A closer look at intrusion detection system for web applications. Secur Commun Netw 2018:1–27. https://doi.org/10.1155/2018/9601357
Aljawarneh S, Aldwairi M, Yassein MB (2018) Anomaly-based intrusion detection system through feature selection analysis and building hybrid efficient model. J Comput Sci 25:152–160
Allahyari M, Pouriyeh S, Assefi M, Safaei S, Trippe ED, Gutierrez JB, Kochut K (2017) A brief survey of text mining: Classification, clustering and extraction techniques. arXiv preprint arXiv:170702919
Almseidin M, Alzubi M, Kovacs S, Alkasassbeh M (2017) Evaluation of machine learning algorithms for intrusion detection system. In: 2017 IEEE 15th International Symposium on Intelligent Systems and Informatics (SISY), IEEE, pp 000277–000282
Amiri F, Yousefi MR, Lucas C, Shakery A, Yazdani N (2011) Mutual information-based feature selection for intrusion detection systems. J Netw Comput Appl 34(4):1184–1199
Balasaraswathi VR, Sugumaran M, Hamid Y (2017) Feature selection techniques for intrusion detection using non-bio-inspired and bio-inspired optimization algorithms. J Commun Inform Netw 2(4):107–119
Benaddi H, Ibrahimi K, Benslimane A (2018) Improving the intrusion detection system for nsl-kdd dataset based on pca-fuzzy clustering-knn. In: 2018 6th International Conference on Wireless Networks and Mobile Communications (WINCOM), IEEE, pp 1–6
Besharati E, Naderan M, Namjoo E (2019) Lr-hids: logistic regression host-based intrusion detection system for cloud environments. J Ambient Intell Human Comput 10(9):3669–3692
Biswas SK (2018) Intrusion detection using machine learning: a comparison study. Int J Pure Appl Math 118(19):101–114
Bitaab M, Hashemi S (2017) Hybrid intrusion detection: Combining decision tree and gaussian mixture model. In: 2017 14th International ISC (Iranian Society of Cryptology) Conference on Information Security and Cryptology (ISCISC), IEEE, pp 8–12
Breiman L (2017) Classification and regression trees. Routledge, Abingdon
Chomboon K, Chujai P, Teerarassamee P, Kerdprasop K, Kerdprasop N (2015) An empirical study of distance metrics for k-nearest neighbor algorithm. In: Proceedings of the 3rd international conference on industrial application engineering, pp 1–6
Chou TS, Yen KK, Luo J (2008) Network intrusion detection design using feature selection of soft computing paradigms. Int J Computat Intell 4(3):196–208
Da Silva IN, Spatti DH, Flauzino RA, Liboni LHB, dos Reis Alves SF (2017) Artificial neural networks. Springer International Publishing, Cham
Dash M, Liu H (1997) Feature selection for classification. Intelligent data analysis 1(1–4):131–156
Denning DE (1987) An intrusion-detection model. IEEE Trans Softw Eng 2:222–232
Deshmukh DH, Ghorpade T, Padiya P (2015) Improving classification using preprocessing and machine learning algorithms on nsl-kdd dataset. In: 2015 International Conference on Communication, Information & Computing Technology (ICCICT), IEEE, pp 1–6
Dogan Ü, Glasmachers T, Igel C (2016) A unified view on multi-class support vector classification. J Mach Learn Res 17(45):1–32
Ektefa M, Memar S, Sidi F, Affendey LS (2010) Intrusion detection using data mining techniques. In: 2010 International Conference on Information Retrieval & Knowledge Management (CAMP), IEEE, pp 200–203
Fadlil A, Riadi I, Aji S (2017) Ddos attacks classification using numeric attributebased gaussian naive bayes. Int J Adv Comput Sci Appl (IJACSA) 8(8):42–50
Hackeling G (2017) Mastering Machine Learning with scikit-learn. Packt Publishing Ltd, pp 1–254. https://www.packtpub.com/in/big-data-and-business-intelligence/mastering-machine-learning-scikit-learn-second-edition
Harrell FE Jr (2015) Regression modeling strategies: with applications to linear models, logistic and ordinal regression, and survival analysis. Springer, Berlin
Heba FE, Darwish A, Hassanien AE, Abraham A (2010) Principle components analysis and support vector machine based intrusion detection system. In: Intelligent Systems Design and Applications (ISDA), 2010 10th International Conference on, IEEE, pp 363–367
Ingre B, Yadav A (2015) Performance analysis of nsl-kdd dataset using ann. In: 2015 International Conference on Signal Processing and Communication Engineering Systems, IEEE, pp 92–96
Jović A, Brkić K, Bogunović N (2015) A review of feature selection methods with applications. In: 2015 38th International Convention on Information and Communication Technology. Electronics and Microelectronics (MIPRO), IEEE, pp 1200–1205
Kloft M, Brefeld U, Düessel P, Gehl C, Laskov P (2008) Automatic feature selection for anomaly detection. In: Proceedings of the 1st ACM workshop on Workshop on AISec, ACM, pp 71–76
Kumar K, Batth JS (2016) Network intrusion detection with feature selection techniques using machine-learning algorithms. Int J Comput Appl 150(12):1–13. https://doi.org/10.5120/ijca2016910764
Kumari B, Swarnkar T (2011) Filter versus wrapper feature subset selection in large dimensionality micro array: a review. Int J Comput Sci Inf Technol 2(3):1048–1053
Larson D (2016) Distributed denial of service attacks-holding back the flood. Netw Secur 2016(3):5–7
Li J, Cheng K, Wang S, Morstatter F, Trevino RP, Tang J, Liu H (2017) Feature selection: A data perspective. ACM Comput Surv 50:94:1–94:45
Maillo J, Ramírez S, Triguero I, Herrera F (2017) knn-is: an iterative spark-based design of the k-nearest neighbors classifier for big data. Knowl Based Syst 117:3–15
Mandal N, Jadhav S (2016) A survey on network security tools for open source. In: 2016 IEEE International Conference on Current Trends in Advanced Computing (ICCTAC), IEEE, pp 1–6
Mansournia MA, Geroldinger A, Greenland S, Heinze G (2017) Separation in logistic regression: causes, consequences, and control. Am J Epidemiol 187(4):864–870
Mayuranathan M, Murugan M, Dhanakoti V (2019) Best features based intrusion detection system by rbm model for detecting ddos in cloud environment. J Ambient Intell Human Comput: 1–11
McHugh J (2000) Testing intrusion detection systems: a critique of the 1998 and 1999 darpa intrusion detection system evaluations as performed by lincoln laboratory. ACM Trans Inform Syst Secu (TISSEC) 3(4):262–294
Meira J, Andrade R, Praça I, Carneiro J, Bolón-Canedo V, Alonso-Betanzos A, Marreiros G (2019) Performance evaluation of unsupervised techniques in cyber-attack anomaly detection. J Ambient Intell Human Comput. https://doi.org/10.1007/s12652-019-01417-9
Meyer D, Wien FT (2015) Support vector machines. Interf Libsvm Pack e1071:28
Mkuzangwe NN, Nelwamondo F (2017) Ensemble of classifiers based network intrusion detection system performance bound. In: 2017 4th International Conference on Systems and Informatics (ICSAI), IEEE, pp 970–974
Mousavi SM, Majidnezhad V, Naghipour A (2019) A new intelligent intrusion detector based on ensemble of decision trees. J Ambient Intell Human Comput. https://doi.org/10.1007/s12652-019-01596-5
Mukherjee S, Sharma N (2012) Intrusion detection using naive bayes classifier with feature reduction. Proc Technol 4:119–128
Nehinbe JO (2011) A critical evaluation of datasets for investigating idss and ipss researches. In: 2011 IEEE 10th International Conference on Cybernetic Intelligent Systems (CIS), IEEE, pp 92–97
Nguyen H, Franke K, Petrovic S (2010) Improving effectiveness of intrusion detection by correlation feature selection. In: Availability, Reliability, and Security, 2010. ARES’10 International Conference on, IEEE, pp 17–24
Olusola AA, Oladele AS, Abosede DO (2010) Analysis of kdd’99 intrusion detection dataset for selection of relevance features. Proc World Cong Eng Comput Sci Citeseer 1:20–22
Phutane MT, Pathan A (2015) Intrusion detection system using decision tree and apriori algorithm. J Comput Eng Technol 6(7):09–18
Puga JL, Krzywinski M, Altman N (2015) Points of significance: Bayes’ theorem. Nat Methods 12:277–278. https://doi.org/10.1038/nmeth.3335
Rajput D, Thakkar A (2019) A survey on different network intrusion detection systems and countermeasure. Emerging Research in Computing. Information, Communication and Applications, Springer, pp 497–506
Richhariya R, Manjhwar AK, Makwana RRS (2017) A hybrid approach for user to root and remote to local attack. Int J Comput Sci Eng 5(6):73–79
Russell SJ, Norvig P (2016) Artificial intelligence: a modern approach. Pearson Education Limited, Malaysia
Sahani R, Rout C, Badajena JC, Jena AK, Das H, et al. (2018) Classification of intrusion detection using data mining techniques. In: Progress in computing, analytics and networking, Springer, pp 753–764
Sharafaldin I, Lashkari AH, Ghorbani AA (2018) Toward generating a new intrusion detection dataset and intrusion traffic characterization. In: ICISSP, pp 108–116
Smaha SE (1988) Haystack: An intrusion detection system. In: [Proceedings 1988] Fourth Aerospace Computer Security Applications, IEEE, pp 37–44
Song YY, Ying L (2015) Decision tree methods: applications for classification and prediction. Shanghai Arch Psychiatry 27(2):130
Subba B, Biswas S, Karmakar S (2016) Enhancing performance of anomaly based intrusion detection systems through dimensionality reduction using principal component analysis. In: 2016 IEEE International Conference on Advanced Networks and Telecommunications Systems (ANTS), IEEE, pp 1–6
Suthaharan S (2016) Support vector machine. In: Machine learning models and algorithms for big data classification, vol 36. Springer, pp 207–235
Tavallaee M, Bagheri E, Lu W, Ghorbani AA (2009) A detailed analysis of the kdd cup 99 data set. In: 2009 IEEE symposium on computational intelligence for security and defense applications, IEEE, pp 1–6
Thakkar A, Lohiya R (2020a) A review of the advancement in intrusion detection datasets. Procedia Comput Sci 167:636–645. https://doi.org/10.1016/j.procs.2020.03.330
Thakkar A, Lohiya R (2020b) Role of swarm and evolutionary algorithms for intrusion detection system: a survey. Swarm Evolut Comput 53:100631
Thaseen IS, Kumar CA (2017) Intrusion detection model using fusion of chi-square feature selection and multi class svm. J King Saud Univ Comput Inform Sci 29(4):462–472
van Gerven M, Bohte S (2018) Artificial neural networks as models of neural information processing. Frontiers Media SA. https://www.frontiersin.org/research-topics/4817/artificial-neural-networks-as-models-of-neural-information-processing
Wahba Y, ElSalamouny E, ElTaweel G (2015) Improving the performance of multi-class intrusion detection systems using feature reduction. arXiv preprint arXiv:150706692
Witten IH, Frank E, Hall MA, Pal CJ (2016) Data mining: practical machine learning tools and techniques, 3rd edn. Morgan Kaufmann, pp 1–629. ISBN 978-0-12-374856-0. https://doi.org/10.1016/B978-0-12-374856-0.00002-X
Zainal A, Maarof MA, Shamsuddin SM et al (2009) Ensemble classifiers for network intrusion detection system. J Inform Assur Secur 4(3):217–225
Zaman S, Karray F (2009) Features selection for intrusion detection systems based on support vector machines. In: Consumer Communications and Networking Conference, 2009. CCNC 2009. 6th IEEE, IEEE, pp 1–8
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Thakkar, A., Lohiya, R. Attack classification using feature selection techniques: a comparative study. J Ambient Intell Human Comput 12, 1249–1266 (2021). https://doi.org/10.1007/s12652-020-02167-9
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s12652-020-02167-9