GAN-based imbalanced data intrusion detection system

  • Original Article
  • Personal and Ubiquitous Computing

Abstract

With the development of deep learning technologies, a wide variety of research is being performed on detecting intrusions using vast amounts of data. Although deep learning is more accurate than conventional machine learning algorithms when trained on large amounts of data, its performance declines significantly when it learns from imbalanced data. While there are many studies on imbalanced data, most have weaknesses that can result in data loss or overfitting. The purpose of this study is to solve data imbalance by using a Generative Adversarial Network (GAN), an unsupervised deep learning method that generates new virtual data similar to the existing data. The study also proposes a model that classifies the data with Random Forest to evaluate detection performance after the data imbalance has been addressed with a GAN. The experimental results show that the proposed model performs better than a model that classifies without addressing the data imbalance. In addition, the proposed model was found to perform excellently in comparison with other models that have been widely used for the data imbalance problem.
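The rebalancing step described in the abstract can be illustrated with a minimal sketch. It only shows how many synthetic samples each minority class needs and how generator output is appended to the training set; it is not the authors' implementation. The names `synthetic_counts`, `make_jitter_generator`, and `rebalance` are hypothetical, and the jitter generator is a simple Gaussian-noise stand-in for a trained GAN generator, whose architecture is not given in this preview.

```python
import random
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

Sample = List[float]
# Assumed interface: (number of samples wanted, real samples of one class) -> synthetic samples.
SampleGenerator = Callable[[int, Sequence[Sample]], List[Sample]]


def synthetic_counts(labels: Sequence[str]) -> Dict[str, int]:
    """Return how many synthetic samples each class needs to match the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    return {label: target - n for label, n in counts.items()}


def make_jitter_generator(seed: int = 0, scale: float = 0.05) -> SampleGenerator:
    """Hypothetical stand-in for a trained GAN generator.

    It copies a random real sample and adds Gaussian noise; a real GAN
    would instead map random latent vectors through a learned network.
    """
    rng = random.Random(seed)

    def generate(n: int, real: Sequence[Sample]) -> List[Sample]:
        return [[x + rng.gauss(0.0, scale) for x in rng.choice(real)] for _ in range(n)]

    return generate


def rebalance(
    features: Sequence[Sample], labels: Sequence[str], generate: SampleGenerator
) -> Tuple[List[Sample], List[str]]:
    """Append synthetic samples so every class is as large as the majority class."""
    by_class: Dict[str, List[Sample]] = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)

    new_x, new_y = list(features), list(labels)
    for label, needed in synthetic_counts(labels).items():
        if needed > 0:
            new_x.extend(generate(needed, by_class[label]))
            new_y.extend([label] * needed)
    return new_x, new_y


# Toy data (invented for illustration): 6 benign flows, 2 attack flows, two features each.
X = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3], [0.3, 0.0],
     [0.1, 0.1], [0.2, 0.2], [0.9, 0.8], [0.8, 0.9]]
y = ["benign"] * 6 + ["attack"] * 2

X_bal, y_bal = rebalance(X, y, make_jitter_generator())
print(Counter(y_bal))
```

After rebalancing, both classes contain six samples, and the original eight rows are left unchanged at the front of the list. The balanced set could then be passed to a Random Forest classifier (for example, scikit-learn's `RandomForestClassifier`) and compared against a classifier trained on the original, imbalanced data.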

Funding

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (No. NRF-2018R1D1A1B07043982).

Author information

Corresponding author

Correspondence to KeeHyun Park.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lee, J., Park, K. GAN-based imbalanced data intrusion detection system. Pers Ubiquit Comput 25, 121–128 (2021). https://doi.org/10.1007/s00779-019-01332-y

  • DOI: https://doi.org/10.1007/s00779-019-01332-y
