An Incremental Learning Algorithm on Imbalanced Data for Network Intrusion Detection Systems

Authors:

Masayoshi Aritsugi

ICCCM '22: Proceedings of the 10th International Conference on Computer and Communications Management

Pages 191 - 199

https://doi.org/10.1145/3556223.3556252

Published: 16 October 2022

Abstract

Incremental learning is a promising approach for building an adaptive network intrusion detection system (IDS) model. In contrast with batch learning models, incremental learning models can be retrained easily when new network intrusion data emerge. Moreover, some incremental learning models, such as the Hoeffding Tree, can be retrained using only the latest training data. This advantage is appealing because computer networks produce enormous amounts of data every day. Using incremental learning models to detect ever-growing network intrusions can save computational resources while preserving model performance. However, network data suffer from the imbalanced data problem, in which the class distribution of the training data is often severely disproportionate. This imbalance degrades the performance of incremental learning algorithms. To mitigate this problem, we propose an incremental learning algorithm for network IDSs that can learn from imbalanced data. Our proposed method is an ensemble incremental learning algorithm composed of the Hoeffding Tree, incremental Adaptive Boosting (AdaBoost), and Hard Sampling algorithms. The experimental results show that our proposed model outperforms the other incremental learning models tested in this study. Moreover, our proposed method increases the robustness of the incremental learning model against the imbalanced data problem.
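To make the ensemble idea concrete, the following is a minimal sketch of online boosting in the style of Oza and Russell, trained one example at a time on a synthetic imbalanced stream. It is not the authors' implementation. The base learner `OnlineCentroid` is a hypothetical stand-in for a Hoeffding Tree, the names `OnlineAdaBoost`, `poisson`, `train_demo`, and `max_rate` are assumptions introduced for illustration, and the Hard Sampling component is left out because the abstract does not describe how it works.

```python
import math
import random
from collections import defaultdict


class OnlineCentroid:
    """Tiny incremental classifier; a hypothetical stand-in for a Hoeffding Tree."""

    def __init__(self):
        self.sums = {}                    # label -> per-feature running sum
        self.counts = defaultdict(float)  # label -> number of examples seen

    def learn_one(self, x, y):
        s = self.sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        self.counts[y] += 1.0

    def predict_one(self, x):
        if not self.sums:
            return None  # nothing learned yet

        def sq_dist(label):
            c = self.counts[label]
            return sum((v - s / c) ** 2 for v, s in zip(x, self.sums[label]))

        return min(self.sums, key=sq_dist)


def poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's method; adequate for small rates."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


class OnlineAdaBoost:
    """Online boosting: each example's rate grows when earlier models misclassify it."""

    def __init__(self, n_models=3, seed=0, max_rate=20.0):
        self.models = [OnlineCentroid() for _ in range(n_models)]
        self.correct = [0.0] * n_models  # weight of correctly classified examples
        self.wrong = [0.0] * n_models    # weight of misclassified examples
        self.rng = random.Random(seed)
        self.max_rate = max_rate         # assumption: cap keeps the sampler cheap

    def learn_one(self, x, y):
        lam = 1.0
        for m, model in enumerate(self.models):
            # Present the example k ~ Poisson(lam) times to this model.
            for _ in range(poisson(lam, self.rng)):
                model.learn_one(x, y)
            if model.predict_one(x) == y:
                self.correct[m] += lam
                eps = self.wrong[m] / (self.correct[m] + self.wrong[m])
                lam *= 1.0 / (2.0 * (1.0 - eps))  # easy example: lower the rate
            else:
                self.wrong[m] += lam
                eps = self.wrong[m] / (self.correct[m] + self.wrong[m])
                lam *= 1.0 / (2.0 * eps)          # hard example: raise the rate
            lam = min(lam, self.max_rate)

    def predict_one(self, x):
        votes = defaultdict(float)
        for m, model in enumerate(self.models):
            total = self.correct[m] + self.wrong[m]
            label = model.predict_one(x)
            if total == 0.0 or label is None:
                continue
            eps = min(max(self.wrong[m] / total, 1e-9), 1.0 - 1e-9)
            if eps >= 0.5:
                continue  # skip models that are no better than chance
            votes[label] += math.log((1.0 - eps) / eps)
        return max(votes, key=votes.get) if votes else None


def train_demo(n=2000, minority_rate=0.05, seed=0):
    """Train on a synthetic stream in which class 1 is the rare 'intrusion' class."""
    rng = random.Random(seed)
    model = OnlineAdaBoost(n_models=3, seed=seed)
    for _ in range(n):
        y = 1 if rng.random() < minority_rate else 0
        centre = 5.0 if y == 1 else 0.0
        x = (rng.gauss(centre, 1.0), rng.gauss(centre, 1.0))
        model.learn_one(x, y)  # one pass, no stored history
    return model


if __name__ == "__main__":
    trained = train_demo()
    print(trained.predict_one((0.0, 0.0)), trained.predict_one((5.0, 5.0)))
```

Each example is seen once and then discarded, which mirrors the abstract's point that an incremental model can be updated from the latest data alone. The rate `lam` is what steers later ensemble members toward examples that earlier members misclassified, which on an imbalanced stream tends to include the minority class.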


Index Terms

An Incremental Learning Algorithm on Imbalanced Data for Network Intrusion Detection Systems
1. Security and privacy
  1. Intrusion/anomaly detection and malware mitigation
    1. Intrusion detection systems

Published In

ICCCM '22: Proceedings of the 10th International Conference on Computer and Communications Management

July 2022

289 pages

ISBN:9781450396349

DOI:10.1145/3556223

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article
Research
Refereed limited

Conference

ICCCM 2022

ICCCM 2022: The 10th International Conference on Computer and Communications Management

July 29 - 31, 2022

Okayama, Japan
