DOI: 10.1145/3543507.3583394

Scan Me If You Can: Understanding and Detecting Unwanted Vulnerability Scanning

Published: 30 April 2023

Abstract

Web vulnerability scanners (WVSs) are indispensable tools for penetration testers and developers of web applications, allowing them to identify and fix low-hanging vulnerabilities before they are discovered by attackers. Unfortunately, malicious actors leverage the very same tools to identify and exploit vulnerabilities in third-party websites. Existing research in the WVS space is largely concerned with how many vulnerabilities these tools can discover, as opposed to trying to identify the tools themselves when they are used illicitly.
In this work, we design a testbed to characterize web vulnerability scanners using browser-based and network-based fingerprinting techniques. We conduct a measurement study over 12 web vulnerability scanners as well as 159 users who were recruited to interact with the same web applications that were targeted by the evaluated WVSs. By contrasting the traffic and behavior of these two groups, we discover tool-specific and type-specific behaviors in WVSs that are absent from regular users.
Based on these observations, we design and build ScannerScope, a machine-learning-based, web vulnerability scanner detection system. ScannerScope consists of a transparent reverse proxy that injects fingerprinting modules on the fly without the assistance (or knowledge) of the protected web applications. Our evaluation results show that ScannerScope can effectively detect WVSs and protect web applications against unwanted vulnerability scanning, with a detection accuracy of over 99% combined with near-zero false positives on human-visitor traffic. Finally, we show that the asynchronous design of ScannerScope results in a negligible impact on server performance and demonstrate that its classifier can resist adversarial ML attacks launched by sophisticated adversaries.
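
To illustrate what injecting a fingerprinting module on the fly could look like at the proxy layer, the following is a minimal sketch and not ScannerScope's actual implementation, which the abstract does not describe. It assumes the proxy already holds the decoded HTML body of an upstream response; the function name `inject_fingerprint_script` and the script path `/__fp/module.js` are hypothetical.

```python
import re

# Hypothetical path: the abstract does not name the injected module.
FINGERPRINT_SCRIPT_URL = "/__fp/module.js"

# Closing tags are matched case-insensitively, since served HTML varies.
_HEAD_CLOSE = re.compile(r"</head\s*>", re.IGNORECASE)
_BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)


def inject_fingerprint_script(html: str,
                              script_url: str = FINGERPRINT_SCRIPT_URL) -> str:
    """Return html with a script tag added, leaving the rest untouched.

    The tag goes just before </head> if present, otherwise just before
    </body>, otherwise at the end of the document.
    """
    tag = f'<script src="{script_url}" async></script>'
    for pattern in (_HEAD_CLOSE, _BODY_CLOSE):
        match = pattern.search(html)
        if match:
            return html[:match.start()] + tag + html[match.start():]
    return html + tag


if __name__ == "__main__":
    page = "<html><head><title>t</title></head><body>hi</body></html>"
    print(inject_fingerprint_script(page))
```

Because the rewrite happens on the response as it passes through the proxy, the protected application needs no code changes, which matches the transparency property claimed above. A real deployment would also have to handle compressed or streamed bodies and update the `Content-Length` header, all of which this sketch omits.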

Published In

WWW '23: Proceedings of the ACM Web Conference 2023
April 2023
4293 pages
ISBN:9781450394161
DOI:10.1145/3543507

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Fingerprinting
  2. Vulnerabilities
  3. Web Vulnerability Scanner

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WWW '23: The ACM Web Conference 2023
April 30 - May 4, 2023
Austin, TX, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Article Metrics

  • Total Citations: 0
  • Total Downloads: 397
  • Downloads (Last 12 months): 210
  • Downloads (Last 6 weeks): 25

Reflects downloads up to 25 Feb 2025
