Abstract
In network intrusion detection, some attack types occur rarely, so few samples of them can be collected and the resulting datasets are class-imbalanced. A classifier trained on an imbalanced dataset is biased toward the majority classes, which degrades its performance on minority class samples. A common remedy is to oversample the minority classes and undersample the majority classes to obtain a balanced dataset. Following this approach, we propose a data balancing technique based on the AutoEncoder-Flow (AE-Flow) model. First, we use an AutoEncoder (AE) to improve the flow-based deep generative model (Flow), obtaining AE-Flow, and use it to learn the distribution of minority class samples and generate new samples. Second, we use the K-means and OneSidedSelection (OSS) algorithms to undersample the majority class samples. Finally, we train a machine learning (ML) classifier on the balanced dataset to perform intrusion detection. We conducted comparative experiments on the NSL-KDD dataset. The results show that the balanced dataset obtained by our method improves recall on minority class samples as well as classification performance over all samples.
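The bias toward majority class samples, and the reason recall on minority classes is the metric of interest, can be illustrated with a minimal sketch. The labels, class counts, and the always-majority "classifier" below are hypothetical toy values chosen for illustration; they are not taken from the paper's NSL-KDD experiments, and `per_class_recall` is an illustrative helper rather than part of the proposed method.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: recall}, where recall = correct predictions / true samples."""
    support = Counter(y_true)  # number of true samples per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class
    return {label: hits[label] / n for label, n in support.items()}


# Hypothetical imbalanced toy data: 95 majority records, 5 rare-attack records.
y_true = ["normal"] * 95 + ["u2r"] * 5

# A degenerate classifier that always predicts the majority class.
y_pred = ["normal"] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = per_class_recall(y_true, y_pred)

print(f"accuracy = {accuracy:.2f}")
for label, value in recall.items():
    print(f"recall[{label}] = {value:.2f}")
```

In this toy case overall accuracy looks high even though no minority class sample is detected, which is why a balancing method is better judged by per-class recall than by accuracy alone.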
Supported by the National Natural Science Foundation of China (51808079), the Chongqing Research Program of Basic Research and Frontier Technology (cstc2017jcyjAX0470, cstc2017jcyjAX0135), and the Science and Technology Research Program of Chongqing Municipal Education Commission (No. KJQN201801908).
Copyright information
© 2023 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Xiong, X. et al. (2023). Data Balancing Technique Based on AE-Flow Model for Network Instrusion Detection. In: Gao, F., Wu, J., Li, Y., Gao, H. (eds) Communications and Networking. ChinaCom 2022. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 500. Springer, Cham. https://doi.org/10.1007/978-3-031-34790-0_14
DOI: https://doi.org/10.1007/978-3-031-34790-0_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-34789-4
Online ISBN: 978-3-031-34790-0
eBook Packages: Computer Science, Computer Science (R0)