DOI: 10.1145/3497737.3497740
research-article

A Survey on Machine Learning based Intrusion Detection Systems Using Apache Spark

Published: 23 December 2021

Abstract

The emergence and wide adoption of the Internet have brought convenience to people's lives, but they have also introduced many security problems. How to protect networks and how to detect and prevent intrusions are a focus of current research. This survey first introduces examples of big data technology and machine learning applied to intrusion detection, and then describes intrusion detection systems, machine learning algorithms, and deep learning algorithms in detail. Finally, it lists models that apply Apache Spark to intrusion detection systems and concludes that combining Spark with machine learning can make intrusion detection systems more efficient.

Cited By

  • (2024) Elevating IDS Capabilities: The Convergence of SVM, Deep Learning, and RFECV in Network Security. 2024 Second International Conference on Emerging Trends in Information Technology and Engineering (ICETITE), 1-16. DOI: 10.1109/ic-ETITE58242.2024.10493239. Online publication date: 22-Feb-2024

    Published In

    HPCCT '21: Proceedings of the 2021 5th High Performance Computing and Cluster Technologies Conference
    July 2021
    58 pages
    ISBN: 9781450390132
    DOI: 10.1145/3497737
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tag

    1. Intrusion Detection System, Machine Learning, Deep Learning, Apache Spark

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    HPCCT 2021
