Towards Data Science for Cybersecurity: Machine Learning Advances as Glowing Perspective

  • Conference paper
Intelligent Systems and Applications (IntelliSys 2022)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 543)

Abstract

The current computing context has created significant opportunities and challenges, driven by the new attacks that emerged during the COVID-19 pandemic. Cybersecurity has undergone, and is still undergoing, significant changes in both technology and operations. Many computer security incident response teams (CSIRTs) and cybersecurity centers have reported notable attack behaviors and raised multiple warning signs. Some of these warnings were ignored by third parties; others were taken into consideration, and new frameworks began to be translated into research directions through collaboration between researchers and professionals. According to the conclusions of the CSIRTs, data science is leading and setting the tone of this change. Properly identifying security incident patterns and other insights within cybersecurity data, and implementing the right data-driven model, is the main task to accomplish for an automated and intelligent security system. In this paper, we propose a machine learning framework for cybersecurity, focusing on data science for cybersecurity, in which the data are collected from trusted sources that are relevant for cybersecurity. Our work aims to start a discussion of various open research challenges and to point out the most challenging future research directions. Overall, our purpose is not limited to discussing data science within the cybersecurity context and the relevant methods and algorithms; it also focuses on making the most intelligent data-based decisions to protect systems against cyber attacks.

Corresponding author

Correspondence to Marius Iulian Mihailescu.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Mihailescu, M.I., Nita, S.L. (2023). Towards Data Science for Cybersecurity: Machine Learning Advances as Glowing Perspective. In: Arai, K. (eds) Intelligent Systems and Applications. IntelliSys 2022. Lecture Notes in Networks and Systems, vol 543. Springer, Cham. https://doi.org/10.1007/978-3-031-16078-3_2
