research-article

The Chameleon on the Web: an Empirical Study of the Insidious Proactive Web Defacements

Author: Rui Zhao, University of Nebraska at Omaha, USA (ORCID: 0000-0001-8292-8483)

WWW '23: Proceedings of the ACM Web Conference 2023, April 2023, Pages 2241–2251. https://doi.org/10.1145/3543507.3583377

Published: 30 April 2023

ABSTRACT

Web defacement is one of the major promotional channels for online underground economies. Attackers regularly compromise benign websites and inject fraudulent content to promote illicit goods and services. Defacement inflicts significant harm on websites’ reputations and revenues and may lead to legal ramifications. In this paper, we uncover proactive web defacements, where the involved web pages (i.e., landing pages) proactively deface themselves within browsers using JavaScript (i.e., control scripts). Proactive web defacements have not yet received attention from research communities, anti-hacking organizations, or law-enforcement officials. To detect proactive web defacements, we designed a practical tool, PACTOR. It runs in the browser and intercepts JavaScript API calls that manipulate web page content. It takes snapshots of the rendered HTML source code immediately before and after the intercepted API calls and detects proactive web defacements by visually comparing every two consecutive snapshots. Our two-month empirical study, using PACTOR, on 2,454 incidents of proactive web defacements shows that they can evade existing URL safety-checking tools and effectively promote the ranking of their landing pages using legitimate content/keywords. We also investigated the vendor network of proactive web defacements and reported all the involved domains to law-enforcement officials and URL safety-checking tools.
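
The "compare every two consecutive snapshots" step can be illustrated with a minimal sketch. The abstract does not specify which visual comparison PACTOR uses, so the code below substitutes a plain per-pixel difference as a stand-in. The function names (`changed_fraction`, `flag_suspicious_transitions`), the grayscale-grid representation of a rendered snapshot, and the `tolerance` and `threshold` values are all assumptions made for illustration, not details from the paper.

```python
from typing import List, Sequence

# Hypothetical representation: a rendered snapshot as rows of 0-255 grayscale pixels.
Image = Sequence[Sequence[int]]


def changed_fraction(before: Image, after: Image, tolerance: int = 16) -> float:
    """Fraction of pixel positions whose intensity differs by more than `tolerance`."""
    if len(before) != len(after) or any(
        len(row_b) != len(row_a) for row_b, row_a in zip(before, after)
    ):
        raise ValueError("snapshots must have identical dimensions")
    total = sum(len(row) for row in before)
    if total == 0:
        return 0.0
    changed = sum(
        1
        for row_b, row_a in zip(before, after)
        for p_b, p_a in zip(row_b, row_a)
        if abs(p_b - p_a) > tolerance
    )
    return changed / total


def flag_suspicious_transitions(
    snapshots: List[Image], threshold: float = 0.5
) -> List[int]:
    """Return each index i where snapshot i -> i+1 changed more than `threshold`.

    `threshold` is an illustrative value, not one taken from the paper.
    """
    return [
        i
        for i in range(len(snapshots) - 1)
        if changed_fraction(snapshots[i], snapshots[i + 1]) > threshold
    ]


# Toy 4x4 "screenshots": a benign page, a minor update, then a full rewrite.
benign = [[255] * 4 for _ in range(4)]
tweaked = [row[:] for row in benign]
tweaked[0][0] = 0
defaced = [[0] * 4 for _ in range(4)]

print(flag_suspicious_transitions([benign, tweaked, defaced]))  # [1]
```

In this toy run only the second transition is flagged: the minor update changes 1 of 16 pixels, while the full rewrite changes 15 of 16. A real pipeline would take the snapshots from a browser and would likely use a perceptual similarity measure rather than a raw pixel count.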

Index Terms

Security and privacy → Software and application security → Web application security


Published in
WWW '23: Proceedings of the ACM Web Conference 2023
April 2023
4293 pages
ISBN: 9781450394161
DOI: 10.1145/3543507
Editors: Ying Ding, Jie Tang, Juan Sequeda, Lora Aroyo, Carlos Castillo, Geert-Jan Houben
Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
proactive
security
trust
web defacement
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%