Empirical Evaluations of Machine Learning Effectiveness in Detecting Web Application Attacks

Ismail, Muhusina; Alrabaee, Saed; Harous, Saad; Choo, Kim-Kwang Raymond

doi:10.1007/978-3-031-50051-0_8

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 542)

Included in the following conference series:

International Conference on Future Access Enablers of Ubiquitous and Intelligent Infrastructures

Abstract

Web applications remain a significant attack vector for cybercriminals seeking to exploit application vulnerabilities and gain unauthorized access to privileged data. In this research, we evaluate the efficacy of eight supervised machine learning algorithms in detecting and countering web application attacks: Naive Bayes, Decision Tree, AdaBoost, Random Forest, Logistic Regression, K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Artificial Neural Network (ANN). Our results indicate that the KNN and Random Forest classifiers achieve an accuracy of 89% and an area under the curve (AUC) of 94% on the CSIC HTTP dataset, a commonly used benchmark in the field. The Naive Bayes classifier proves the most efficient, requiring the least computational time to differentiate between malicious and benign HTTP requests. These findings may help direct future efforts towards more efficient, machine learning-driven defenses against web application attacks.
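The two headline metrics in the abstract can be illustrated with a minimal sketch. It assumes that "area under the curve" refers to the ROC AUC, which is the usual reading but is not stated in this preview. The function names, the toy labels, and the scores below are hypothetical and are not taken from the chapter or from the CSIC dataset.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of requests whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random malicious request outscores a random benign one.

    Ties count as half a win (the Mann-Whitney U formulation of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, made up for illustration: 1 = malicious, 0 = benign.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]  # hypothetical classifier scores
preds = [int(s >= 0.5) for s in scores]  # assumed decision threshold of 0.5

acc = accuracy(labels, preds)
auc = roc_auc(labels, scores)
print(f"accuracy={acc:.3f} auc={auc:.3f}")
```

Accuracy depends on the chosen decision threshold, whereas AUC is computed from the ranking of the scores alone, which is one reason the two figures reported for a classifier can differ.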

Acknowledgment

This work was supported by grant number 12R170.

Author information

Authors and Affiliations

Department of Information Systems and Security, CIT, United Arab Emirates University, Al Ain, UAE
Muhusina Ismail & Saed Alrabaee
Department of Computer Science, College of Computing and Informatics, University of Sharjah, Sharjah, UAE
Saad Harous
Department of Information Systems and Cyber Security, University of Texas at San Antonio, San Antonio, USA
Kim-Kwang Raymond Choo

Corresponding author

Correspondence to Saed Alrabaee.

Editor information

Editors and Affiliations

University of Zagreb, Zagreb, Croatia
Dragan Perakovic
Technical University of Košice, Prešov, Slovakia
Lucia Knapcikova

Cite this paper

Ismail, M., Alrabaee, S., Harous, S., Choo, KK.R. (2024). Empirical Evaluations of Machine Learning Effectiveness in Detecting Web Application Attacks. In: Perakovic, D., Knapcikova, L. (eds) Future Access Enablers for Ubiquitous and Intelligent Infrastructures. FABULOUS 2023. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 542. Springer, Cham. https://doi.org/10.1007/978-3-031-50051-0_8

DOI: https://doi.org/10.1007/978-3-031-50051-0_8
Published: 15 December 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50050-3
Online ISBN: 978-3-031-50051-0
eBook Packages: Computer Science, Computer Science (R0)
