Effective analysis, characterization, and detection of malicious web pages

Author:
Birhanu Eshete

Fondazione Bruno Kessler, Trento, Italy


WWW '13 Companion: Proceedings of the 22nd International Conference on World Wide Web, May 2013, Pages 355–360. https://doi.org/10.1145/2487788.2487942

Published: 13 May 2013

ABSTRACT

The steady evolution of the Web has paved the way for miscreants to exploit vulnerabilities and embed malicious content into web pages. Upon a visit, malicious web pages steal sensitive data, redirect victims to other malicious targets, or seize control of the victim's system to mount future attacks. Approaches to detecting malicious web pages have been reactively effective against specific classes of attacks such as drive-by-downloads. However, the prevalence and complexity of attacks by malicious web pages are still worrisome. The main challenges in this problem domain are (1) fine-grained capturing and characterization of attack payloads, (2) evolution of web page artifacts, and (3) flexibility and scalability of detection techniques in the face of a fast-changing threat landscape. To this end, we proposed a holistic approach that leverages static analysis, dynamic analysis, machine learning, and evolutionary searching and optimization to effectively analyze and detect malicious web pages. We do so by introducing novel features that capture a fine-grained snapshot of malicious web pages, characterizing malicious web pages holistically, and applying evolutionary techniques to fine-tune learning-based detection models in step with the evolution of attack payloads. In this paper, we present the key intuition and details of our approach, the results obtained so far, and future work.
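
As a rough illustration of what "evolutionary techniques to fine-tune learning-based detection models" could look like in practice, the sketch below uses a small genetic algorithm to search over feature subsets for a toy page classifier. Everything here is an assumption made for illustration: the feature names, the synthetic data, the nearest-centroid classifier, and the GA settings are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical feature names; not the paper's actual feature set.
FEATURES = ["url_length", "script_count", "iframe_count", "noise_a", "noise_b"]


def make_pages(rng, n=200):
    """Build synthetic (feature_vector, label) pairs; label 1 = malicious."""
    pages = []
    for i in range(n):
        label = i % 2  # alternate labels so both classes are always present
        # The first three features shift with the label; the last two are noise.
        informative = [rng.gauss(2.0 * label, 1.0) for _ in range(3)]
        noise = [rng.gauss(0.0, 3.0) for _ in range(2)]
        pages.append((informative + noise, label))
    return pages


def accuracy(mask, train, valid):
    """Accuracy of a nearest-centroid classifier restricted to masked-in features."""
    cols = [i for i, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in train if y == label]
        centroids[label] = [sum(r[i] for r in rows) / len(rows) for i in cols]
    correct = 0
    for x, y in valid:
        dist = {
            label: sum((x[i] - c) ** 2 for i, c in zip(cols, cen))
            for label, cen in centroids.items()
        }
        correct += min(dist, key=dist.get) == y
    return correct / len(valid)


def evolve(train, valid, rng, pop_size=12, generations=15, mutation_rate=0.2):
    """Evolve a binary feature mask; the top half of each generation survives."""
    n = len(FEATURES)

    def fitness(mask):
        return accuracy(mask, train, valid)

    # Seed with the all-features mask plus random masks.
    population = [[1] * n] + [
        [rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection with elitism
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = parents + children
    best = max(population, key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    rng = random.Random(0)
    pages = make_pages(rng)
    best_mask, score = evolve(pages[:100], pages[100:], rng)
    kept = [name for name, keep in zip(FEATURES, best_mask) if keep]
    print("selected features:", kept, "validation accuracy:", score)
```

Because the best masks survive each generation and the all-features mask is in the starting population, the returned subset never scores below the all-features baseline on the validation split. A real system would re-run such a search as new attack payloads appear and would score candidates on data held out from the final evaluation.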

Index Terms

Effective analysis, characterization, and detection of malicious web pages
1. Security and privacy
  1. Intrusion/anomaly detection and malware mitigation
2. Social and professional topics
  1. Computing / technology policy
    1. Computer crime

Published in
WWW '13 Companion: Proceedings of the 22nd International Conference on World Wide Web
May 2013
1636 pages
ISBN:9781450320382
DOI:10.1145/2487788
General Chairs: Daniel Schwabe (PUC-Rio, Brazil), Virgílio Almeida (UFMG, Brazil), Hartmut Glaser (CGI.br, Brazil)
Program Chairs: Ricardo Baeza-Yates (Yahoo! Labs, Spain & Chile), Sue Moon (KAIST, South Korea)
Copyright © 2013 Copyright is held by the International World Wide Web Conference Committee (IW3C2).
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
dynamic analysis
effective detection
machine learning
malicious web pages
static analysis
web-based attacks
Qualifiers
- abstract
Acceptance Rates
WWW '13 Companion Paper Acceptance Rate: 831 of 1,250 submissions, 66%
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%