Article

Fishing for phishes: applying capture-recapture methods to estimate phishing populations

Authors:
Rhiannon Weaver, CERT Network Situational Awareness Group, Pittsburgh, PA
M. Patrick Collins, CERT Network Situational Awareness Group, Pittsburgh, PA
eCrime '07: Proceedings of the anti-phishing working groups 2nd annual eCrime researchers summit, October 2007, Pages 14–25. https://doi.org/10.1145/1299015.1299017

Published: 04 October 2007

ABSTRACT

We estimate the extent of phishing activity on the Internet via capture-recapture analysis of two major phishing site reports. Capture-recapture analysis is a population estimation technique originally developed for wildlife conservation, but it is applicable in any environment in which multiple independent parties collect reports of an activity.
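
As a rough illustration of the two-report case, the classical Lincoln-Petersen estimator takes the sizes of two independent reports and the size of their overlap and returns a population estimate. The sketch below is the textbook form and is not necessarily the model used in this paper; the function name and the site identifiers are hypothetical.

```python
def lincoln_petersen(report_a, report_b):
    """Estimate total population size from two independent reports.

    Uses N_hat = (n_a * n_b) / m, where n_a and n_b are the report sizes
    and m is the number of items that appear in both reports.
    """
    a, b = set(report_a), set(report_b)
    overlap = len(a & b)
    if overlap == 0:
        # With no recaptures the estimator is undefined.
        raise ValueError("no overlap between reports; estimate is undefined")
    return len(a) * len(b) / overlap


# Hypothetical identifiers, for illustration only.
report_a = {"site1", "site2", "site3", "site4"}
report_b = {"site3", "site4", "site5", "site6", "site7", "site8"}
estimate = lincoln_petersen(report_a, report_b)
print(estimate)
```

The estimate relies on the reports being independent and on every item being equally likely to be reported, which is the assumption that reporting preferences can violate.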

Generating a meaningful population estimate for phishing activity requires addressing complex relationships between phishers and phishing reports. Phishers clandestinely occupy machines and add evasive measures to phishing URLs to evade firewalls and other fraud-detection measures. Phishing reports, meanwhile, may demonstrate a preference towards certain classes of phish.

We address these problems by estimating population in terms of netblocks and by clustering phishing attempts together into scams, which are groups of phishes that demonstrate similar behavior on multiple axes. We generate population estimates using data from two different phishing reports over an 80-day period, and show that these reports capture approximately 40% of scams and 80% of CIDR /24 (256 contiguous address) netblocks involved in phishing.
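
Counting in terms of netblocks means mapping each reported host address to the /24 block that contains it before comparing reports. A minimal sketch of that mapping, using Python's standard `ipaddress` module, is shown here; the helper names are hypothetical, the addresses are reserved documentation addresses, and the paper's own preprocessing may differ.

```python
import ipaddress


def to_slash24(ip):
    """Return the /24 netblock (256 contiguous addresses) containing ip."""
    # strict=False lets us pass a host address and have the host bits masked.
    return ipaddress.ip_network(f"{ip}/24", strict=False)


def netblocks(ips):
    """Collapse a collection of IPv4 addresses into their distinct /24 blocks."""
    return {to_slash24(ip) for ip in ips}


# Documentation addresses (RFC 5737), for illustration only.
ips = ["192.0.2.10", "192.0.2.200", "198.51.100.7"]
blocks = netblocks(ips)
print(sorted(str(b) for b in blocks))
```

Two addresses in the same /24 collapse into one unit, so evasive changes of host within a block do not inflate the count.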


Published in
eCrime '07: Proceedings of the anti-phishing working groups 2nd annual eCrime researchers summit
October 2007
90 pages
ISBN:9781595939395
DOI:10.1145/1299015
General Chair:
Lorrie Faith Cranor
Carnegie Mellon University
Copyright © 2007 ACM
Publisher
Association for Computing Machinery
New York, NY, United States