Detection and Classification of Web Application Attacks

  • Conference paper
Advances and Trends in Artificial Intelligence. Theory and Applications (IEA/AIE 2023)

Abstract

Web applications have become ubiquitous and offer a wide range of services, from content management and e-commerce to social networking. However, these applications are also prime targets for cyberattacks that exploit a variety of vulnerabilities. With the growing use of Ubiquitous Web Applications (UWA), which can be accessed globally from various devices, it is imperative to automate the detection and classification of these attacks. In this study, we detect and classify web attacks using several machine learning classifiers. We compare the web attack classification results of Decision Tree, Random Forest, Support Vector Classifier (SVC), and K-Nearest Neighbor (KNN) models on web server logs, using multiple text feature vectorization techniques: the context-insensitive TF-IDF vectorizer, the bidirectional context-aware BERT transformer, and a combination of both. We find that the Random Forest classifier performs best when the text features captured by the web server logs are encoded with the BERT transformer, reaching 99% accuracy and \(F_{1}\) score for classifying web attacks. We also find no significant gain in accuracy from transformers over the TF-IDF vectorizer for these text features, presumably because of the preprocessing techniques we apply to the command-like syntax. In addition, with TF-IDF text vectorization, both the SVC and KNN models performed better than the Random Forest model at detecting and classifying web application attacks from web server logs.
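To make the TF-IDF plus KNN combination mentioned in the abstract concrete, the following is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the request lines, labels, tokenizer, smoothed IDF formula, and the choice of k = 1 are all assumptions made for illustration, and a real study would use actual web server logs and a library implementation.

```python
import math
import re
from collections import Counter

# Hypothetical toy request lines and labels, invented for this illustration.
TRAIN = [
    ("GET /index.html HTTP/1.1", "benign"),
    ("GET /products?id=42 HTTP/1.1", "benign"),
    ("GET /products?id=1' OR '1'='1 HTTP/1.1", "sqli"),
    ("GET /search?q=1 UNION SELECT password FROM users HTTP/1.1", "sqli"),
    ("GET /search?q=<script>alert(1)</script> HTTP/1.1", "xss"),
    ("GET /comment?text=<script>document.cookie</script> HTTP/1.1", "xss"),
]


def tokenize(line):
    """Lowercase the line and split it into word and punctuation tokens."""
    return re.findall(r"[a-z0-9_]+|[^\sa-z0-9_]", line.lower())


def fit_idf(token_lists):
    """Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1."""
    n = len(token_lists)
    df = Counter(t for tokens in token_lists for t in set(tokens))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_vector(tokens, idf):
    """L2-normalized sparse TF-IDF vector; tokens unseen in training are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    weights = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}


def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors (a dot product)."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())


IDF = fit_idf([tokenize(line) for line, _ in TRAIN])
TRAIN_VECTORS = [tfidf_vector(tokenize(line), IDF) for line, _ in TRAIN]
TRAIN_LABELS = [label for _, label in TRAIN]


def classify(line, k=1):
    """Label a request line by majority vote among its k most similar training lines."""
    query = tfidf_vector(tokenize(line), IDF)
    ranked = sorted(
        zip(TRAIN_VECTORS, TRAIN_LABELS),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    for request in (
        "GET /items?id=7 UNION SELECT card FROM payments HTTP/1.1",
        "GET /profile?name=<script>alert(2)</script> HTTP/1.1",
        "GET /index.html HTTP/1.1",
    ):
        print(classify(request), "<-", request)
```

With so few training lines, k = 1 is used because a larger neighborhood would be dominated by tokens that every request shares (method, slashes, protocol version); on a realistic log corpus, k would normally be tuned by cross-validation.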

Author information

Corresponding author

Correspondence to Jayanthi Ramamoorthy.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Ramamoorthy, J., Oladimeji, D., Garland, L., Liu, Q. (2023). Detection and Classification of Web Application Attacks. In: Fujita, H., Wang, Y., Xiao, Y., Moonis, A. (eds) Advances and Trends in Artificial Intelligence. Theory and Applications. IEA/AIE 2023. Lecture Notes in Computer Science, vol 13926. Springer, Cham. https://doi.org/10.1007/978-3-031-36822-6_26

  • DOI: https://doi.org/10.1007/978-3-031-36822-6_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36821-9

  • Online ISBN: 978-3-031-36822-6

  • eBook Packages: Computer Science, Computer Science (R0)
