Scan Me If You Can: Understanding and Detecting Unwanted Vulnerability Scanning

Authors: Babak Amin Azad, Nick Nikiforakis

WWW '23: Proceedings of the ACM Web Conference 2023

Pages 2284 - 2294

https://doi.org/10.1145/3543507.3583394

Published: 30 April 2023

Abstract

Web vulnerability scanners (WVS) are an indispensable tool for penetration testers and developers of web applications, allowing them to identify and fix low-hanging vulnerabilities before they are discovered by attackers. Unfortunately, malicious actors leverage the very same tools to identify and exploit vulnerabilities in third-party websites. Existing research in the WVS space is largely concerned with how many vulnerabilities these tools can discover, as opposed to trying to identify the tools themselves when they are used illicitly.

In this work, we design a testbed to characterize web vulnerability scanners using browser-based and network-based fingerprinting techniques. We conduct a measurement study over 12 web vulnerability scanners as well as 159 users who were recruited to interact with the same web applications that were targeted by the evaluated WVSs. By contrasting the traffic and behavior of these two groups, we discover tool-specific and type-specific behaviors in WVSs that are absent from regular users. Based on these observations, we design and build ScannerScope, a machine-learning-based, web vulnerability scanner detection system. ScannerScope consists of a transparent reverse proxy that injects fingerprinting modules on the fly without the assistance (or knowledge) of the protected web applications. Our evaluation results show that ScannerScope can effectively detect WVSs and protect web applications against unwanted vulnerability scanning, with a detection accuracy of over 99% combined with near-zero false positives on human-visitor traffic. Finally, we show that the asynchronous design of ScannerScope results in a negligible impact on server performance and demonstrate that its classifier can resist adversarial ML attacks launched by sophisticated adversaries.
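To make the idea of on-the-fly injection concrete, the following is a minimal sketch of how a reverse proxy might rewrite an upstream HTML response so that a fingerprinting script is loaded without any change to the protected application. It is an illustration under stated assumptions, not ScannerScope's actual implementation: the script path `/__fp/collect.js` and the function name `inject_fingerprinting` are hypothetical, and the abstract does not describe the injection mechanism at this level of detail.

```python
import re

# Hypothetical script URL (assumption); the abstract does not name one.
FINGERPRINT_SNIPPET = '<script src="/__fp/collect.js" async></script>'

# Matches a closing body tag regardless of letter case.
_BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)


def inject_fingerprinting(content_type: str, body: str) -> str:
    """Insert the snippet before the last </body> of an HTML response.

    Non-HTML responses (JSON, images, etc.) are returned unchanged so the
    proxy stays transparent to API and asset traffic.
    """
    if "text/html" not in content_type.lower():
        return body
    matches = list(_BODY_CLOSE.finditer(body))
    if not matches:
        # No closing body tag: append the snippet at the end instead.
        return body + FINGERPRINT_SNIPPET
    last = matches[-1]
    return body[:last.start()] + FINGERPRINT_SNIPPET + body[last.start():]
```

In a real deployment this function would sit in the proxy's response path; the `async` attribute is used here so that the injected script does not block page rendering, which is one plausible reading of the asynchronous design mentioned in the abstract.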

Index Terms

1. Security and privacy

Published In

WWW '23: Proceedings of the ACM Web Conference 2023

April 2023

4293 pages

ISBN:9781450394161

DOI:10.1145/3543507

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

WWW '23

Sponsor:

SIGWEB

WWW '23: The ACM Web Conference 2023

April 30 - May 4, 2023

Austin, TX, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
