Towards Data Science for Cybersecurity: Machine Learning Advances as Glowing Perspective

Mihailescu, Marius Iulian; Nita, Stefania Loredana

doi:10.1007/978-3-031-16078-3_2

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 543)

Included in the following conference series:

Proceedings of SAI Intelligent Systems Conference

Abstract

The attacks that emerged during the COVID-19 pandemic have created important opportunities and challenges in the current computing context, and cybersecurity has undergone, and is still undergoing, significant changes in its technology and operation. Many computer security incident response teams (CSIRTs) and cybersecurity centers have reported significant attack behaviors and raised multiple warning signs. Some of these warnings were ignored by third parties, while others were taken into consideration, and new frameworks began to be translated into research directions through collaboration between researchers and professionals. According to the conclusions of the CSIRTs, data science is leading this change and setting its tone. Properly identifying security incident patterns and other insights within cybersecurity data, and implementing the right data-driven model, is the main task in achieving an automated and intelligent security system. In this paper, we propose a machine learning framework for cybersecurity, focusing on data science for cybersecurity, that uses data collected from trusted sources and relevant to cybersecurity. Our work aims to start a discussion of various research challenges that remain open for improvement, and it points out the most challenging future research directions. Altogether, our purpose is not limited to discussing data science within the cybersecurity context and the relevant methods and algorithms; we also focus on the applicability of making the most intelligent data-driven decisions to protect systems against cyber attacks.


Author information

Authors and Affiliations

Scientific Research Center in Mathematics and Computer Science, SPIRU HARET University, Bucharest, Romania
Marius Iulian Mihailescu
Department of Computers and Cybersecurity, “FERDINAND I” Military Technical Academy, Bucharest, Romania
Stefania Loredana Nita

Corresponding author

Correspondence to Marius Iulian Mihailescu.

Editor information

Editors and Affiliations

Faculty of Science and Engineering, Saga University, Saga, Japan
Kohei Arai

Cite this paper

Mihailescu, M.I., Nita, S.L. (2023). Towards Data Science for Cybersecurity: Machine Learning Advances as Glowing Perspective. In: Arai, K. (eds) Intelligent Systems and Applications. IntelliSys 2022. Lecture Notes in Networks and Systems, vol 543. Springer, Cham. https://doi.org/10.1007/978-3-031-16078-3_2

DOI: https://doi.org/10.1007/978-3-031-16078-3_2
Published: 01 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16077-6
Online ISBN: 978-3-031-16078-3
eBook Packages: Intelligent Technologies and Robotics (R0)
