Combining Oversampling with Recurrent Neural Networks for Intrusion Detection

Wang, Jenq-Haur; Septian, Tri Wanda

doi:10.1007/978-3-030-73216-5_21

Jenq-Haur Wang & Tri Wanda Septian

Conference paper
First Online: 06 April 2021

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12680)

Abstract

Previous studies on intrusion detection focus on analyzing features from existing datasets. Because attacks are varied and change quickly, we need to adapt to new features for effective protection. Since real network traffic is highly imbalanced, it is essential to train classifiers that can deal with rare cases. In this paper, we propose combining oversampling techniques with deep learning methods for intrusion detection in imbalanced network traffic. First, after preprocessing with data cleaning and normalization, we use feature importance weights generated from ensemble decision trees to select important features. Then, the Synthetic Minority Oversampling Technique (SMOTE) is used to create synthetic samples from the minority class. Finally, we use Recurrent Neural Networks (RNNs), including Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), for classification. In our experimental results, oversampling improves the performance of intrusion detection for both machine learning and deep learning methods. The best performance is obtained on the CIC-IDS2017 dataset using an LSTM classifier, with an F1-score of 98.9%, and on the CSE-CIC-IDS2018 dataset using a GRU, with an F1-score of 98.8%. This shows the potential of our proposed approach in detecting new types of intrusion from imbalanced real network traffic.
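
The oversampling step can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority-class neighbours. This is not the authors' implementation. The function name `smote`, its parameters, and the toy records are assumptions made for illustration, and base samples are drawn at random rather than visited in turn.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (hypothetical helper)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist

    # For each sample, precompute the indices of its k nearest neighbours
    # (Euclidean distance, excluding the sample itself).
    neighbours = []
    for i, x in enumerate(minority):
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(x, minority[j]))
        neighbours.append(others[:k])

    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))   # random base sample
        j = rng.choice(neighbours[i])      # one of its nearest neighbours
        gap = rng.random()                 # position along the segment, in [0, 1)
        x, nb = minority[i], minority[j]
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


# Toy, made-up "attack" records with two normalized features each.
attacks = [[0.10, 0.80], [0.15, 0.75], [0.20, 0.90], [0.12, 0.85]]
new_attacks = smote(attacks, n_synthetic=6, k=2)
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the range the minority class already covers. In a pipeline like the one described above, such oversampling would be applied to the training split only, after feature selection and before fitting the LSTM or GRU classifier.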

Acknowledgements

This work was partially supported by research grants from the Ministry of Science and Technology, Taiwan, under grant number MOST109-2221-E-027-090, and partially supported by the National Applied Research Laboratories, Taiwan, under grant number NARL-ISIM-109-002.

Author information

Authors and Affiliations

National Taipei University of Technology, Taipei, Taiwan
Jenq-Haur Wang
Sriwijaya University, Palembang, Indonesia
Tri Wanda Septian

Corresponding author

Correspondence to Jenq-Haur Wang.

Editor information

Editors and Affiliations

Aalborg University, Aalborg, Denmark
Christian S. Jensen
Singapore Management University, Singapore, Singapore
Ee-Peng Lim
Academia Sinica, Taipei, Taiwan
De-Nian Yang
National Central University, Taoyuan City, Taiwan
Chia-Hui Chang
Hong Kong Baptist University, Kowloon Tong, Hong Kong
Jianliang Xu
National Chiao Tung University, Hsinchu, Taiwan
Wen-Chih Peng
National Cheng Kung University, Tainan City, Taiwan
Jen-Wei Huang
National Tsing Hua University, Hsinchu, Taiwan
Chih-Ya Shen

About this paper

Cite this paper

Wang, JH., Septian, T.W. (2021). Combining Oversampling with Recurrent Neural Networks for Intrusion Detection. In: Jensen, C.S., et al. Database Systems for Advanced Applications. DASFAA 2021 International Workshops. DASFAA 2021. Lecture Notes in Computer Science, vol 12680. Springer, Cham. https://doi.org/10.1007/978-3-030-73216-5_21

DOI: https://doi.org/10.1007/978-3-030-73216-5_21
Published: 06 April 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-73215-8
Online ISBN: 978-3-030-73216-5
eBook Packages: Computer Science, Computer Science (R0)
