On the Use of VGs for Feature Selection in Supervised Machine Learning - A Use Case to Detect Distributed DoS Attacks

Lopes, João; Partida, Alberto; Pinto, Pedro; Pinto, António

doi:10.1007/978-3-031-53025-8_19

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1981)

Included in the following conference series:

International Conference on Optimization, Learning Algorithms and Applications

Abstract

Information systems depend on security mechanisms to detect and respond to cyber-attacks. One of the most frequent attacks is the Distributed Denial of Service (DDoS) attack: it impairs system performance and, in the worst case, causes prolonged downtime that prevents business processes from running normally. To detect this attack, several supervised Machine Learning (ML) algorithms have been developed, and companies use them to protect their servers. A key stage in these algorithms is feature pre-processing, in which input data features are assessed and selected to obtain the best results in the subsequent stages of the supervised ML pipeline. This article proposes an innovative approach to feature selection: the use of Visibility Graphs (VGs) to select features for supervised ML algorithms that detect DDoS attacks. The results show that VGs can be implemented quickly and can compete with other ML feature selection methods, as they require few computational resources and offer satisfactory results, at least in our example based on the early detection of DDoS attacks. The size of the processed data appears to be the main implementation constraint for this novel feature selection method.
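
As an illustration of the underlying idea, the following minimal Python sketch builds a natural visibility graph from a single feature's time series, using the standard visibility criterion: two samples are linked when every sample between them lies strictly below the straight line joining them. The abstract does not describe how the authors turn a VG into a feature ranking, so the `mean_degree` helper is a hypothetical score introduced here only for illustration, not the paper's method.

```python
from typing import List, Sequence, Set


def natural_visibility_graph(series: Sequence[float]) -> List[Set[int]]:
    """Return adjacency sets of the natural visibility graph of `series`.

    Samples are assumed to be equally spaced, so the index is used as time.
    """
    n = len(series)
    adjacency: List[Set[int]] = [set() for _ in range(n)]
    for a in range(n - 1):
        for b in range(a + 1, n):
            ya, yb = series[a], series[b]
            # a and b see each other if every intermediate sample c lies
            # strictly below the line segment joining (a, ya) and (b, yb).
            visible = all(
                series[c] < yb + (ya - yb) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                adjacency[a].add(b)
                adjacency[b].add(a)
    return adjacency


def mean_degree(series: Sequence[float]) -> float:
    """Hypothetical per-feature score: the average node degree of its VG.

    This is an assumption made for illustration; the source does not say
    which graph property is used to rank or select features.
    """
    adjacency = natural_visibility_graph(series)
    if not adjacency:
        return 0.0
    return sum(len(neighbours) for neighbours in adjacency) / len(adjacency)
```

This naive construction checks every pair of samples and every sample between them, so its cost grows roughly cubically with the series length in the worst case. That may be one reason data size matters for VG-based methods, although the abstract does not state the cause of the constraint it reports.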

Acknowledgements

This work was partially supported by the Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF), within the project “Cybers SeC IP” (NORTE-01-0145-FEDER-000044).

Author information

Authors and Affiliations

ADiT-Lab, Instituto Politécnico de Viana do Castelo, Viana do Castelo, Portugal
João Lopes & Pedro Pinto
Data, Complex Networks and Cybersecurity Sciences Technological Institute, Rey Juan Carlos University, Madrid, Spain
Alberto Partida
Universidade da Maia, Maia, Portugal
Pedro Pinto
CIICESI, ESTG, Instituto Politécnico do Porto, Felgueiras, Portugal
António Pinto
INESC TEC, Porto, Portugal
Pedro Pinto & António Pinto

Authors

João Lopes
Alberto Partida
Pedro Pinto
António Pinto

Corresponding author

Correspondence to João Lopes.

Editor information

Editors and Affiliations

Instituto Politécnico de Bragança, Bragança, Portugal
Ana I. Pereira
University of Azores, Ponta Delgada, Portugal
Armando Mendes
Instituto Politécnico de Bragança, Bragança, Portugal
Florbela P. Fernandes
Instituto Politécnico de Bragança, Bragança, Portugal
Maria F. Pacheco
Instituto Politécnico de Bragança, Bragança, Portugal
João P. Coelho
Instituto Politécnico de Bragança, Bragança, Portugal
José Lima


Cite this paper

Lopes, J., Partida, A., Pinto, P., Pinto, A. (2024). On the Use of VGs for Feature Selection in Supervised Machine Learning - A Use Case to Detect Distributed DoS Attacks. In: Pereira, A.I., Mendes, A., Fernandes, F.P., Pacheco, M.F., Coelho, J.P., Lima, J. (eds) Optimization, Learning Algorithms and Applications. OL2A 2023. Communications in Computer and Information Science, vol 1981. Springer, Cham. https://doi.org/10.1007/978-3-031-53025-8_19

DOI: https://doi.org/10.1007/978-3-031-53025-8_19
Published: 01 February 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53024-1
Online ISBN: 978-3-031-53025-8
eBook Packages: Computer Science, Computer Science (R0)
