On the Use of VGs for Feature Selection in Supervised Machine Learning - A Use Case to Detect Distributed DoS Attacks

  • Conference paper
Optimization, Learning Algorithms and Applications (OL2A 2023)

Abstract

Information systems depend on security mechanisms to detect and respond to cyber-attacks. One of the most frequent attacks is the Distributed Denial of Service (DDoS) attack: it impairs system performance and, in the worst case, causes prolonged downtime that prevents business processes from running normally. To detect this attack, several supervised Machine Learning (ML) algorithms have been developed, and companies use them to protect their servers. A key stage in these algorithms is feature pre-processing, in which input data features are assessed and selected to obtain the best results in the subsequent stages of the supervised ML pipeline. This article proposes an innovative approach to feature selection: the use of Visibility Graphs (VGs) to select features for supervised ML algorithms that detect DDoS attacks. The results show that VGs can be implemented quickly and can compete with other feature selection methods: they require few computational resources and offer satisfactory results, at least in our example based on the early detection of DDoS attacks. The size of the processed data appears to be the main implementation constraint of this novel feature selection method.
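For readers unfamiliar with VGs, the following is a minimal Python sketch of how a time series can be mapped to a natural visibility graph. It is an illustration under stated assumptions, not the procedure used in the paper: the function names `natural_visibility_edges` and `mean_degree` are hypothetical, and using the mean node degree as a per-feature score is an assumption made only to show how a graph property could summarise a feature.

```python
from itertools import combinations


def natural_visibility_edges(series):
    """Return the edge set of the natural visibility graph of `series`.

    Samples a and b are linked when every sample c between them lies
    strictly below the straight line joining (a, series[a]) and
    (b, series[b]). Adjacent samples are therefore always linked.
    """
    edges = set()
    for a, b in combinations(range(len(series)), 2):
        ya, yb = series[a], series[b]
        visible = all(
            series[c] < yb + (ya - yb) * (b - c) / (b - a)
            for c in range(a + 1, b)
        )
        if visible:
            edges.add((a, b))
    return edges


def mean_degree(series):
    """Average node degree of the graph (hypothetical feature score)."""
    n = len(series)
    return 2 * len(natural_visibility_edges(series)) / n if n else 0.0


# Example: one column of a traffic dataset treated as a time series.
print(sorted(natural_visibility_edges([1, 3, 1])))
print(mean_degree([3, 1, 2, 4]))
```

This brute-force version tests every pair of samples, so its cost grows quickly with the length of the series, which is consistent with data size being a practical constraint.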


Acknowledgements

This work was partially supported by the Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF), within the project “Cybers SeC IP” (NORTE-01-0145-FEDER-000044).

Author information

Corresponding author

Correspondence to João Lopes.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Lopes, J., Partida, A., Pinto, P., Pinto, A. (2024). On the Use of VGs for Feature Selection in Supervised Machine Learning - A Use Case to Detect Distributed DoS Attacks. In: Pereira, A.I., Mendes, A., Fernandes, F.P., Pacheco, M.F., Coelho, J.P., Lima, J. (eds) Optimization, Learning Algorithms and Applications. OL2A 2023. Communications in Computer and Information Science, vol 1981. Springer, Cham. https://doi.org/10.1007/978-3-031-53025-8_19

  • DOI: https://doi.org/10.1007/978-3-031-53025-8_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-53024-1

  • Online ISBN: 978-3-031-53025-8

  • eBook Packages: Computer Science, Computer Science (R0)
