Article

Fighting unicode-obfuscated spam

Authors:

Sid Stamm

eCrime '07: Proceedings of the anti-phishing working groups 2nd annual eCrime researchers summit

Pages 45 - 59

https://doi.org/10.1145/1299015.1299020

Published: 04 October 2007

Abstract

In the last few years, spammers have increasingly used obfuscation to get spam emails past filters. The standard method is to use images that look like text, since typical spam filters cannot parse such messages; this is the technique used in so-called "rock phishing". To fight image-based spam, many spam filters apply heuristic rules that flag emails containing images; because few legitimate emails consist mainly of one large image, these rules help detect image-based spam. Spammers therefore have an interest in circumventing these methods. Unicode transliteration is a convenient tool for them: Unicode contains many characters that are distinct but look very similar, so a spammer can replace a message's characters at random with look-alikes, producing a large number of visually near-identical clones of the same message and hiding blacklisted words from filters. To defend against such Unicode-obfuscated spam, we developed a prototype tool that can be used with SpamAssassin to block spam obfuscated in this way by mapping polymorphic messages to a common, more homogeneous representation. This representation can then be filtered using traditional methods. We demonstrate the ease with which Unicode polymorphism can be used to circumvent spam filters such as SpamAssassin, and then describe a de-obfuscation technique that can catch messages obfuscated in this fashion.
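The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' prototype and does not use any SpamAssassin API. The `LOOKALIKES` table and the `obfuscate` and `deobfuscate` functions are hypothetical names introduced here, and the table is a tiny hand-picked sample rather than a real confusables list. The sketch shows random transliteration into look-alike characters and a mapping of all variants back to one canonical ASCII form that a conventional keyword filter could then inspect.

```python
import random
import unicodedata

# Hypothetical, hand-picked sample of look-alike characters (an assumption
# for illustration; a real tool would need a far larger table).
LOOKALIKES = {
    "a": ["\u0430"],            # CYRILLIC SMALL LETTER A
    "c": ["\u0441"],            # CYRILLIC SMALL LETTER ES
    "e": ["\u0435"],            # CYRILLIC SMALL LETTER IE
    "i": ["\u0456", "\u0131"],  # CYRILLIC I, LATIN SMALL LETTER DOTLESS I
    "o": ["\u043e", "\u03bf"],  # CYRILLIC SMALL LETTER O, GREEK OMICRON
    "p": ["\u0440"],            # CYRILLIC SMALL LETTER ER
}

# Reverse table: every look-alike points back to its ASCII letter.
CANONICAL = {
    variant: ascii_char
    for ascii_char, variants in LOOKALIKES.items()
    for variant in variants
}


def obfuscate(text, rng):
    """Randomly swap letters for look-alikes, as a spammer might."""
    out = []
    for ch in text:
        variants = LOOKALIKES.get(ch)
        if variants and rng.random() < 0.5:
            out.append(rng.choice(variants))
        else:
            out.append(ch)
    return "".join(out)


def deobfuscate(text):
    """Map a possibly obfuscated message to a common ASCII-like form."""
    # NFKD folds compatibility forms (e.g. fullwidth letters) and splits
    # accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    out = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # drop combining accents
        out.append(CANONICAL.get(ch, ch))
    return "".join(out)


if __name__ == "__main__":
    message = "cheap pills online"
    clone = obfuscate(message, random.Random())
    # Every random clone collapses to the same canonical string.
    print(deobfuscate(clone) == message)
```

Because every clone maps to the same string, a blacklist only needs to hold the canonical spelling. The trade-off is that aggressive folding can also rewrite legitimate non-Latin text, so a sketch like this would need care before use on multilingual mail.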

Fighting unicode-obfuscated spam
1. Information systems
  1. World Wide Web
    1. Web applications
      1. Internet communications tools
2. Social and professional topics
  1. Computing / technology policy

Published In

eCrime '07: Proceedings of the anti-phishing working groups 2nd annual eCrime researchers summit

October 2007

90 pages

ISBN:9781595939395

DOI:10.1145/1299015

General Chair:
Lorrie Faith Cranor
Carnegie Mellon University

Copyright © 2007 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

eCrime '07: Anti-phishing working group 2007 eCrime Researchers' Summit

October 4 - 5, 2007

Pittsburgh, Pennsylvania, USA

Cited By

Chen W, Wang F, Edwards M (2023). Active Countermeasures for Email Fraud. 2023 IEEE 8th European Symposium on Security and Privacy (EuroS&P), pp. 39-55. Online publication date: Jul-2023.
https://doi.org/10.1109/EuroSP57164.2023.00012
Ivanov N, Lou J, Chen T, Li J, Yan Q, Cao J, Ho Au M, Lin Z, Yung M (2021). Targeting the Weakest Link: Social Engineering Attacks in Ethereum Smart Contracts. Proceedings of the 2021 ACM Asia Conference on Computer and Communications Security, pp. 787-801. Online publication date: 24-May-2021.
https://dl.acm.org/doi/10.1145/3433210.3453085
Yazdani R, van der Toorn O, Sperotto A (2020). A Case of Identity: Detection of Suspicious IDN Homograph Domains Using Active DNS Measurements. 2020 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), pp. 559-564. Online publication date: Sep-2020.
https://doi.org/10.1109/EuroSPW51379.2020.00082
Joseph A, Nelson B, Rubinstein B, Tygar J (2019). Adversarial Machine Learning. Online publication date: 14-Mar-2019.
https://doi.org/10.1017/9781107338548
Alepis E (2019). Notify This: Exploiting Android Notifications for Fun and Profit. Information Systems Security and Privacy, pp. 86-108. Online publication date: 5-Jul-2019.
https://doi.org/10.1007/978-3-030-25109-3_5
Illiano V, Paudice A, Muñoz-González L, Lupu E (2018). Determining Resilience Gains From Anomaly Detection for Event Integrity in Wireless Sensor Networks. ACM Transactions on Sensor Networks 14:1, pp. 1-35. Online publication date: 1-Feb-2018.
https://dl.acm.org/doi/10.1145/3176621
Dhiman M, Jakobsson M, Yen T (2017). Breaking and fixing content-based filtering. 2017 APWG Symposium on Electronic Crime Research (eCrime), pp. 52-56. Online publication date: Apr-2017.
https://doi.org/10.1109/ECRIME.2017.7945054
Aleroud A, Zhou L (2017). Phishing environments, techniques, and countermeasures. Computers and Security 68:C, pp. 160-196. Online publication date: 1-Jul-2017.
https://dl.acm.org/doi/10.1016/j.cose.2017.04.006
Siadati H, Jafarikhah S, Jakobsson M (2016). Traditional Countermeasures to Unwanted Email. Understanding Social Engineering Based Scams, pp. 51-62. Online publication date: 2016.
https://doi.org/10.1007/978-1-4939-6457-4_5
Montazer G, ArabYarmohammadi S (2015). Detection of phishing attacks in Iranian e-banking using a fuzzy-rough hybrid system. Applied Soft Computing 35:C, pp. 482-492. Online publication date: 1-Oct-2015.
https://dl.acm.org/doi/10.1016/j.asoc.2015.05.059
