A practical intrusion detection system based on denoising autoencoder and LightGBM classifier with improved detection performance

Ayubkhan, Sheikh Abdul Hameed; Yap, Wun-She; Morris, Ezra; Rawthar, Mumtaj Begam Kasim

doi:10.1007/s12652-022-04449-w

Original Research
Published: 08 November 2022

Volume 14, pages 7427–7452, (2023)

Journal of Ambient Intelligence and Humanized Computing

Sheikh Abdul Hameed Ayubkhan¹,
Wun-She Yap ORCID: orcid.org/0000-0002-0007-6174¹,
Ezra Morris¹ &
…
Mumtaj Begam Kasim Rawthar²

833 Accesses
8 Citations

Abstract

Autoencoders and conventional machine learning classifiers are widely used to design intrusion detection systems (IDS). However, noise and corruption in high-dimensional network traffic samples still affect the stability and performance of autoencoder-based and other conventional machine learning based IDS models. Distortions in the input datasets cause deviations in the learnt patterns and result in a low detection rate. In addition, IDS classifiers train on every single feature, which increases training time, computational cost and memory usage. The main aim of this work is to remove the distortions from the network traffic and to train the IDS model quickly, so that it detects any category of intruder with a higher detection rate in a short training time. To achieve this, we propose an intrusion detection system that combines a denoising autoencoder and a LightGBM classifier. The denoising autoencoder removes noise and corruption from the network traffic, which can avoid these deviations and enhance the feature learning capacity required for classification. The LightGBM classifier then classifies the samples. The classifier uses the feature histogram bins with larger gradients and so avoids using every feature at every iteration, which accelerates training and boosts the predictive capacity of the model. Compared with existing IDS, the proposed model improves detection performance on nine benchmark datasets (CIDDS-001, CIDDS-002, ISCX-URL2016, UNSW-NB15, CIC-IDS-2017, ISCX-Tor2016, BoT-IoT, IoTID20 and Kyoto 2006+) for both binary and multi-class classification tasks.
The model achieves maximum detection rates of over 99.60% for CIDDS-001, 99.90% for CIDDS-002, 97.00% for ISCX-Tor2016, 96.11% for UNSW-NB15, 99.86% for CIC-IDS-2017, 97.76% for ISCX-URL2016, 99.91% for BoT-IoT, and 97.43% for both IoTID20 and Kyoto 2006+, while the training time ranges from 1.10 to 21.78 s. More importantly, the proposed model has a higher learning and predictive capacity, which boosts its generalization capacity. The model also performs well in detecting all classes, including the minority classes, on all of the aforementioned datasets without any oversampling technique. The efficiency of the model suggests that it can be deployed in real time on industrial network traffic, including IoT-based smart environments and fog-cloud computing networks.
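
The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it makes two assumptions the abstract does not confirm: that the denoising autoencoder is trained on inputs corrupted by masking noise (features zeroed at random), and that the "larger gradients" remark refers to the gradient-based one-side sampling (GOSS) step described in the LightGBM paper (Ke et al. 2017). The function names, the parameters `drop_prob`, `top_rate` and `other_rate`, and the toy gradient values are all hypothetical and chosen for illustration only.

```python
import random


def mask_corrupt(row, drop_prob, rng):
    """Masking noise (assumed noise type): zero each feature with probability drop_prob.

    A denoising autoencoder would be trained to reconstruct `row` from this output.
    """
    return [0.0 if rng.random() < drop_prob else x for x in row]


def goss_sample(gradients, top_rate, other_rate, rng):
    """Gradient-based one-side sampling, following the description in Ke et al. (2017).

    Keeps the top_rate fraction of samples with the largest |gradient|, draws
    other_rate * n of the remaining samples uniformly at random, and up-weights
    the randomly drawn samples by (1 - top_rate) / other_rate.
    Returns a list of (sample_index, weight) pairs.
    """
    n = len(gradients)
    # Sort sample indices by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = round(top_rate * n)
    n_other = round(other_rate * n)
    top, rest = order[:n_top], order[n_top:]
    sampled = rng.sample(rest, min(n_other, len(rest)))
    weight = (1.0 - top_rate) / other_rate
    return [(i, 1.0) for i in top] + [(i, weight) for i in sampled]


rng = random.Random(0)

# Toy feature vector standing in for one network-flow record (made-up values).
flow = [0.5, 1.2, 0.0, 3.4]
noisy_flow = mask_corrupt(flow, drop_prob=0.3, rng=rng)

# Toy per-sample gradients (made-up values).
grads = [0.9, -0.1, 0.05, -0.8, 0.2, 0.01, -0.3, 0.02, 0.6, -0.04]
picked = goss_sample(grads, top_rate=0.2, other_rate=0.3, rng=rng)
```

With ten samples, `top_rate=0.2` keeps the two samples with the largest absolute gradients at weight 1, and `other_rate=0.3` adds three of the remaining eight at weight 0.8 / 0.3, so each boosting iteration would see five samples instead of ten. The up-weighting is what keeps the gradient statistics of the small-gradient samples approximately unbiased after they are subsampled.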

Funding

Funding was provided by Ministry of Higher Education, Malaysia (grant no. TRGS/1/2016/UTAR/01/2/2).

Author information

Authors and Affiliations

Universiti Tunku Abdul Rahman, Kajang, Malaysia
Sheikh Abdul Hameed Ayubkhan, Wun-She Yap & Ezra Morris
University of Nottingham, Semenyih, Malaysia
Mumtaj Begam Kasim Rawthar

Corresponding author

Correspondence to Wun-She Yap.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ayubkhan, S.A.H., Yap, WS., Morris, E. et al. A practical intrusion detection system based on denoising autoencoder and LightGBM classifier with improved detection performance. J Ambient Intell Human Comput 14, 7427–7452 (2023). https://doi.org/10.1007/s12652-022-04449-w

Received: 18 September 2021
Accepted: 04 October 2022
Published: 08 November 2022
Issue Date: June 2023
DOI: https://doi.org/10.1007/s12652-022-04449-w
