DOI: 10.1145/3556223.3556252
research-article

An Incremental Learning Algorithm on Imbalanced Data for Network Intrusion Detection Systems

Published: 16 October 2022

Abstract

Incremental learning is a promising approach for building adaptive network intrusion detection system (IDS) models. Unlike batch learning models, incremental learning models can be retrained easily when new network intrusion data emerge. Moreover, some incremental learning models, such as the Hoeffding Tree, can be retrained using only the latest training data. This advantage is appealing because computer networks produce enormous amounts of data every day. Using incremental learning models to detect ever-growing network intrusions can save computational resources while preserving model performance. However, network data suffer from the imbalanced data problem: the class distribution in the training data is often severely skewed. This imbalance degrades the performance of incremental learning algorithms. To mitigate the problem, we propose an incremental learning algorithm for network IDSs that can learn from imbalanced data. The proposed method is an ensemble incremental learning algorithm composed of the Hoeffding Tree, incremental Adaptive Boosting (AdaBoost), and Hard Sampling algorithms. The experimental results show that the proposed model outperforms the other incremental learning models tested in this study. Moreover, the proposed method makes the incremental learning model more robust to the imbalanced data problem.
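
To illustrate why a Hoeffding Tree can be updated from only the latest data, the minimal Python sketch below computes the Hoeffding bound that such trees generally use to decide when enough samples have been observed to commit to a split. It is not the implementation used in this work. The function names `hoeffding_bound` and `should_split`, and the gain, range, and confidence values in the usage lines, are illustrative assumptions.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability 1 - delta, the true mean of a
    variable with the given range lies within epsilon of the mean observed
    over n independent samples."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Split once the observed gap between the two best candidate attributes
    exceeds the Hoeffding bound; until then, keep only running statistics."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# Hypothetical values: information gain for two classes has range 1.0.
clear_winner = should_split(0.30, 0.10, value_range=1.0, delta=1e-7, n=1000)
near_tie = should_split(0.30, 0.28, value_range=1.0, delta=1e-7, n=1000)
print(clear_winner, near_tie)
```

Because the bound shrinks as n grows, a leaf only needs per-attribute counts rather than the stored samples themselves, which is what allows this family of models to be trained on a stream.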

    Information & Contributors

    Information

    Published In

    ICCCM '22: Proceedings of the 10th International Conference on Computer and Communications Management
    July 2022
    289 pages
    ISBN:9781450396349
    DOI:10.1145/3556223
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. adaptive boosting
    2. hard sampling
    3. hoeffding tree
    4. imbalanced dataset
    5. incremental learning

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICCCM 2022

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 70
    • Downloads (Last 6 weeks): 10
    Reflects downloads up to 22 Feb 2025

    Citations

    Cited By

    • (2025) Adaptable, incremental, and explainable network intrusion detection systems for internet of things. Engineering Applications of Artificial Intelligence 144 (110143). DOI: 10.1016/j.engappai.2025.110143. Online publication date: Mar-2025
    • (2024) FL-IIDS. Future Generation Computer Systems 151:C (57-70). DOI: 10.1016/j.future.2023.09.019. Online publication date: 27-Feb-2024
    • (2024) ExBCIL: an exemplar-based class incremental learning for intrusion detection system. International Journal of Machine Learning and Cybernetics. DOI: 10.1007/s13042-024-02486-9. Online publication date: 25-Dec-2024
    • (2023) Hybrid Feature Selection Framework for Building Resource Efficient Intrusion Detection Systems Model in the Internet of Things. Proceedings of the 8th International Conference on Sustainable Information Engineering and Technology (16-22). DOI: 10.1145/3626641.3626923. Online publication date: 24-Oct-2023
    • (2023) Adaptive Intrusion Detection Systems: Class Incremental Learning for IoT Emerging Threats. 2023 IEEE International Conference on Big Data (BigData) (3547-3555). DOI: 10.1109/BigData59044.2023.10386129. Online publication date: 15-Dec-2023
    • (2022) Performance analysis of Incremental boosting based Transfer Learning in Deep CNN. 2022 3rd International Conference on Communication, Computing and Industry 4.0 (C2I4) (1-6). DOI: 10.1109/C2I456876.2022.10051386. Online publication date: 15-Dec-2022
    • (2022) Two-Stage Sampling: A Framework for Imbalanced Classification With Overlapped Classes. 2022 IEEE International Conference on Big Data (Big Data) (271-280). DOI: 10.1109/BigData55660.2022.10020788. Online publication date: 17-Dec-2022
