A convolutional neural network intrusion detection method based on data imbalance

The Journal of Supercomputing

Abstract

With the rapid development of Internet technology, network attacks have become frequent and network security faces numerous hidden dangers. Improving the ability of intrusion detection systems to detect and defend against attacks is therefore key to ensuring network security. However, when faced with complex, massive network data, traditional machine learning methods suffer from data imbalance and feature redundancy, which lead to low detection rates, high false alarm rates and poor real-time performance. To address these problems, this paper proposes a data imbalance-based Convolutional Neural Network Intrusion Detection Method (CNN-IDMDI). First, an oversampling method addresses the data imbalance: the number of samples for the minority attack classes is increased, and the enlarged sample set is decomposed by multiple sampling into multiple sub-samples. Second, the gradient coordination mechanism is combined with an improved Focal Loss function to calculate the loss between the actual and expected values, so that malicious network attacks can be detected in high-dimensional, unbalanced data. Finally, the proposed method is compared with current mainstream intrusion detection methods on the NSL-KDD dataset for both binary and multi-class detection. The experimental results show that the proposed method effectively improves CNN-based intrusion detection and network anomaly detection. For binary intrusion detection, CNN-IDMDI achieves an average accuracy of 98.73% with a detection time of 1.42 s. Its average accuracy is 15.45%, 12.76% and 2.91% higher than that of the CNN, CNN Long Short-Term Memory (CNN-LSTM) and CNN Neural-induced Support Vector Machine (CNN-NSVM) methods, respectively, and its detection time is 0.82 s, 0.72 s and 0.61 s shorter, respectively.
For multi-class intrusion detection, CNN-IDMDI achieves an average accuracy of 94.55% and completes detection in 2.96 s. This average accuracy is 16.09%, 12.71% and 3.66% higher than that of the CNN, CNN-LSTM and CNN-NSVM methods, respectively. It is also faster: CNN takes 8.84 s, CNN-LSTM 8.31 s and CNN-NSVM 6.43 s. The proposed CNN-IDMDI method therefore offers higher accuracy and faster detection than the compared methods.
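
The abstract names two ingredients, oversampling of minority attack classes and a Focal Loss-based objective, without giving their formulas. The sketch below is only an illustration under stated assumptions: it uses plain random oversampling (not the paper's multiple-sampling decomposition) and the commonly used focal loss form -alpha * (1 - p)^gamma * log(p). The gradient coordination mechanism is not modelled. The function names `random_oversample` and `focal_loss` and the toy data are hypothetical.

```python
import math
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    is as large as the largest one (plain random oversampling)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # All original samples that belong to this class.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_samples.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_samples, out_labels


def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Focal loss for one sample: -alpha * (1 - p)^gamma * log(p),
    where p is the predicted probability of the true class.
    gamma and alpha defaults are illustrative, not the paper's values."""
    p = min(max(p_true, 1e-12), 1.0)  # clamp to avoid log(0)
    return -alpha * (1.0 - p) ** gamma * math.log(p)


if __name__ == "__main__":
    # Toy data: 6 "normal" records and 2 "attack" records (hypothetical).
    X = [[i] for i in range(8)]
    y = ["normal"] * 6 + ["attack"] * 2
    Xb, yb = random_oversample(X, y)
    print(Counter(yb))
    # A well-classified sample (p = 0.9) contributes far less loss
    # than a poorly classified one (p = 0.1).
    print(focal_loss(0.9), focal_loss(0.1))
```

With gamma set to 0 the loss reduces to ordinary cross-entropy, which makes the effect of the modulating factor (1 - p)^gamma easy to check: larger gamma shifts the training signal toward hard, typically minority-class, samples.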

Data availability

A significant amount of data is presented in this article. The remaining data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.

Acknowledgements

This research reports results from the following projects: the scientific research project of special projects in key areas of the Guangdong Provincial Department of Education (No. 2021ZDZX1104); the Basic and Applied Basic Research Project of the Guangzhou Basic Research Program in 2022 (No. 202201010106); the Guangzhou Philosophy and Social Science Planning Project (No. 2022GZGJ241); the key scientific research projects of Guangzhou Nanyang Polytechnic (No. NY2021KYZD01); the Guangdong Provincial Department of Education (No. 2020ZDZX3096); the key projects of social science and technology development in Dongguan (No. 2020507156156); the special fund for Dongguan's Rural Revitalization Strategy in 2021 (No. 20211800400102); the Dongguan special commissioner project (No. 20211800500182); and the Dongguan Joint Fund for Basic and Applied Research of Guangdong Province (No. 2020A1515110162).

Author information

Corresponding authors

Correspondence to Baiqiang Gan or Yuqiang Chen.

Ethics declarations

Conflict of interest

The authors declare no potential conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Gan, B., Chen, Y., Dong, Q. et al. A convolutional neural network intrusion detection method based on data imbalance. J Supercomput 78, 19401–19434 (2022). https://doi.org/10.1007/s11227-022-04633-x

  • DOI: https://doi.org/10.1007/s11227-022-04633-x
