DOI: 10.1145/3131365.3131399
research-article

Email typosquatting

Published: 01 November 2017

Abstract

While website domain typosquatting is highly annoying for legitimate domain operators, research has found that it relatively rarely presents a great risk to individual users. However, any application that relies on the Domain Name System for name resolution (e.g., email or FTP) is equally vulnerable to domain typosquatting, and the consequences may be more dire than with website typosquatting.
This paper presents the first in-depth measurement study of email typosquatting. Working in concert with our IRB, we registered 76 typosquatting domain names to study a wide variety of user mistakes, while minimizing the amount of personal information exposed to us. Over a span of more than seven months, we received millions of emails at our registered domains. While most of these emails are spam, we infer from our measurements that, every year, three of our domains should receive approximately 3,585 "legitimate" emails meant for somebody else. Worse, by examining a small sample of all emails, we find that these emails may contain sensitive information (e.g., visa documents or medical records).
We then project from our measurements that 1,211 typosquatting domains registered by unknown entities receive in the vicinity of 800,000 emails a year. Furthermore, we find that millions of registered typosquatting domains have MX records pointing to only a handful of mail servers. However, a second experiment, in which we send "honey emails" to typosquatting domains, shows only very limited evidence of attempts at credential theft (despite some emails being read), meaning that the threat, for now, appears to remain theoretical.
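
The "user mistakes" mentioned above can be illustrated with a small sketch. The abstract does not say which typo models the study used, so the four single-character edits below (deletion, substitution, adjacent transposition, insertion) are an assumption, chosen because they are the classic edit-distance-one operations. The function name `typo_candidates` and the restriction to lowercase letters and digits are likewise hypothetical choices made for illustration.

```python
import string


def typo_candidates(domain: str) -> set[str]:
    """Return domains whose first label is one edit away from the input's.

    Assumption: only the label before the first dot is mutated, and only
    lowercase ASCII letters and digits are used for new characters.
    """
    label, dot, suffix = domain.partition(".")
    alphabet = string.ascii_lowercase + string.digits
    variants: set[str] = set()

    for i in range(len(label) + 1):
        head, tail = label[:i], label[i:]
        if tail:
            # Deletion: drop the character at position i.
            variants.add(head + tail[1:])
            # Substitution: replace the character at position i.
            variants.update(head + c + tail[1:] for c in alphabet)
        if len(tail) > 1:
            # Transposition: swap the characters at positions i and i + 1.
            variants.add(head + tail[1] + tail[0] + tail[2:])
        # Insertion: add one character before position i.
        variants.update(head + c + tail for c in alphabet)

    # Remove the original label and any empty label produced by deletion.
    variants.discard(label)
    variants.discard("")
    return {v + dot + suffix for v in variants}
```

Calling `typo_candidates("example.com")` yields a set that includes `exmaple.com`, `exampl.com`, and `examplee.com`. Which of those candidates are registered, or carry MX records, would have to be checked separately with a DNS lookup; that step is not shown here.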

Published In

IMC '17: Proceedings of the 2017 Internet Measurement Conference
November 2017
509 pages
ISBN:9781450351188
DOI:10.1145/3131365
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

In-Cooperation

  • USENIX Assoc

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. abuse
  2. domain name
  3. ethics
  4. measurement
  5. typosquatting

Qualifiers

  • Research-article

Conference

IMC '17: Internet Measurement Conference
November 1 - 3, 2017
London, United Kingdom

Acceptance Rates

Overall Acceptance Rate 277 of 1,083 submissions, 26%

Article Metrics

  • Downloads (Last 12 months): 47
  • Downloads (Last 6 weeks): 7
Reflects downloads up to 19 Feb 2025

Cited By

  • (2024) Ten Years of ZMap. Proceedings of the 2024 ACM on Internet Measurement Conference, 139-148. DOI: 10.1145/3646547.3689012. Online publication date: 4-Nov-2024
  • (2024) Username Squatting on Online Social Networks: A Study on X. Proceedings of the 19th ACM Asia Conference on Computer and Communications Security, 621-637. DOI: 10.1145/3634737.3637637. Online publication date: 1-Jul-2024
  • (2023) Unraveling Threat Intelligence Through the Lens of Malicious URL Campaigns. Proceedings of the 18th Asian Internet Engineering Conference, 78-86. DOI: 10.1145/3630590.3630600. Online publication date: 12-Dec-2023
  • (2023) Investigating Package Related Security Threats in Software Registries. 2023 IEEE Symposium on Security and Privacy (SP), 1578-1595. DOI: 10.1109/SP46215.2023.10179332. Online publication date: May-2023
  • (2021) What makes phishing emails hard for humans to detect? Proceedings of the Human Factors and Ergonomics Society Annual Meeting 64:1, 431-435. DOI: 10.1177/1071181320641097. Online publication date: 9-Feb-2021
  • (2021) Where are you taking me? Understanding Abusive Traffic Distribution Systems. Proceedings of the Web Conference 2021, 3613-3624. DOI: 10.1145/3442381.3450071. Online publication date: 19-Apr-2021
  • (2020) Ten years of attacks on companies using visual impersonation of domain names. 2020 APWG Symposium on Electronic Crime Research (eCrime), 1-12. DOI: 10.1109/eCrime51433.2020.9493251. Online publication date: 16-Nov-2020
  • (2020) Automating Domain Squatting Detection Using Representation Learning. 2020 IEEE International Conference on Big Data (Big Data), 1021-1030. DOI: 10.1109/BigData50022.2020.9377875. Online publication date: 10-Dec-2020
  • (2020) Defending Against Package Typosquatting. Network and System Security, 112-131. DOI: 10.1007/978-3-030-65745-1_7. Online publication date: 19-Dec-2020
  • (2019) Opening the Blackbox of VirusTotal. Proceedings of the Internet Measurement Conference, 478-485. DOI: 10.1145/3355369.3355585. Online publication date: 21-Oct-2019
