Detecting Web Attacks using Stacked Denoising Autoencoder and Ensemble Learning Methods

Authors:

Tung Bui

SoICT '19: Proceedings of the 10th International Symposium on Information and Communication Technology

Pages 267 - 272

https://doi.org/10.1145/3368926.3369715

Published: 04 December 2019

Abstract

Web-based anomalies remain a serious security threat on the Internet. This paper proposes using Sum Rule and XGBoost to combine the outputs of several Stacked Denoising Autoencoders (SDAEs) in order to detect abnormal HTTP queries. Both methods inherit a distinct advantage of the SDAE: they do not require handcrafted features to be extracted. Furthermore, they can cope with changing web vulnerabilities, where malicious code is injected into different parts of the request header and body. Experiments were carried out on the DVWA dataset and on a dataset obtained from a real-world application. Sum Rule and XGBoost achieve a higher F1-score than the state-of-the-art Regularized Deep Autoencoders, Isolation Forest, C4.5 decision tree and Long Short-Term Memory network.
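The abstract does not say how the SDAE outputs are scaled or thresholded, so the following is only a minimal sketch of sum-rule fusion under stated assumptions: each detector is assumed to emit one anomaly score per HTTP request (for example, a reconstruction error), scores are min-max scaled with bounds taken from training data, and a request is flagged when the summed score exceeds a threshold. The function names, score values, bounds, and threshold are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def min_max_normalize(score: float, lo: float, hi: float) -> float:
    """Scale a raw score into [0, 1] using bounds assumed to come from training data."""
    scaled = (score - lo) / (hi - lo)
    return min(1.0, max(0.0, scaled))


def sum_rule(normalized_scores: Sequence[float]) -> float:
    """Sum rule: fuse the per-detector scores of one request by adding them."""
    return sum(normalized_scores)


def detect(
    raw_scores: Sequence[Sequence[float]],
    bounds: Sequence[Tuple[float, float]],
    threshold: float,
) -> List[bool]:
    """Return one flag per request; True means the fused score exceeds the threshold."""
    flags = []
    for request_scores in raw_scores:
        normalized = [
            min_max_normalize(s, lo, hi)
            for s, (lo, hi) in zip(request_scores, bounds)
        ]
        flags.append(sum_rule(normalized) > threshold)
    return flags


# Hypothetical scores: four requests (rows) scored by three detectors (columns).
raw_scores = [
    [0.02, 0.03, 0.01],
    [0.04, 0.02, 0.03],
    [0.30, 0.05, 0.40],
    [0.03, 0.45, 0.02],
]
bounds = [(0.0, 0.5), (0.0, 0.5), (0.0, 0.5)]  # assumed (min, max) per detector
threshold = 0.5  # assumed decision threshold on the fused score

flags = detect(raw_scores, bounds, threshold)
print(flags)
```

In this toy data the last request is scored as suspicious by only one of the three detectors, yet the fused score still crosses the threshold. That is the kind of case the abstract describes, where malicious content appears in only one part of the request.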

Information & Contributors

Information

Published In

SoICT '19: Proceedings of the 10th International Symposium on Information and Communication Technology

December 2019

551 pages

ISBN: 9781450372459

DOI: 10.1145/3368926

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

In-Cooperation

SOICT: School of Information and Communication Technology - HUST
NAFOSTED: The National Foundation for Science and Technology Development

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

SoICT 2019

SoICT 2019: The Tenth International Symposium on Information and Communication Technology

December 4 - 6, 2019

Ha Long Bay, Hanoi, Viet Nam

Acceptance Rates

Overall Acceptance Rate 147 of 318 submissions, 46%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 8
Total Downloads: 212
Downloads (Last 12 months): 7
Downloads (Last 6 weeks): 0

Reflects downloads up to 03 Mar 2025

Citations

Cited By

Moreira D, Seara J, Pavia J, Serrão C. (2024). Intelligent Platform for Automating Vulnerability Detection in Web Applications. Electronics 14:1 (79). Online publication date: 27-Dec-2024.
https://doi.org/10.3390/electronics14010079
Jagat R, Sisodia D, Singh P. (2024). Detecting Web Attacks From HTTP Weblogs Using Variational LSTM Autoencoder Deviation Network. IEEE Transactions on Services Computing 17:5 (2210-2222). Online publication date: Sep-2024.
https://doi.org/10.1109/TSC.2024.3453748
Chen P, Deng Y, Zhang X, Ma L, Yan Y, Wu Y, Li C. (2022). Degradation Trend Prediction of Pumped Storage Unit Based on MIC-LGBM and VMD-GRU Combined Model. Energies 15:2 (605). Online publication date: 15-Jan-2022.
https://doi.org/10.3390/en15020605
Chu T, Labiod M, Tran H, Mellouk A. (2022). GADaM: Generic Adaptive Deep-learning-based Multipath Scheduler Selector for Dynamic Heterogeneous Environment. ICC 2022 - IEEE International Conference on Communications (4908-4913). Online publication date: 16-May-2022.
https://doi.org/10.1109/ICC45855.2022.9838658
Riera T, Higuera J, Higuera J, Herraiz J, Montalvo J. (2022). A new multi-label dataset for Web attacks CAPEC classification using machine learning techniques. Computers and Security 120:C. Online publication date: 25-Aug-2022.
https://dl.acm.org/doi/10.1016/j.cose.2022.102788
Hammad M, Hewahi N, Elmedany W. (2022). MMM-RF: A novel high accuracy multinomial mixture model for network intrusion detection systems. Computers & Security 120 (102777). Online publication date: Sep-2022.
https://doi.org/10.1016/j.cose.2022.102777
Muslihi M, Alghazzawi D. (2020). Detecting SQL Injection On Web Application Using Deep Learning Techniques: A Systematic Literature Review. 2020 Third International Conference on Vocational Education and Electrical Engineering (ICVEE) (1-6). Online publication date: 3-Oct-2020.
https://doi.org/10.1109/ICVEE50212.2020.9243198
Chapaneri R, Shah S. (2020). Multi-level Gaussian mixture modeling for detection of malicious network traffic. The Journal of Supercomputing 77:5 (4618-4638). Online publication date: 16-Oct-2020.
https://doi.org/10.1007/s11227-020-03447-z