ABSTRACT
We estimate the extent of phishing activity on the Internet via capture-recapture analysis of two major phishing site reports. Capture-recapture analysis is a population estimation technique originally developed for wildlife conservation, but it is applicable in any environment where multiple independent parties collect reports of an activity.
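As a rough illustration of the two-source idea, the sketch below applies the classical Lincoln-Petersen estimator and its bias-corrected Chapman variant to two report lists. This is a textbook formulation offered under the assumption that the two reports are independent and every item is equally likely to be reported; it is not necessarily the estimator used in the paper, and the function names and sample identifiers are hypothetical.

```python
def lincoln_petersen(report_a, report_b):
    """Two-source population estimate: N = n1 * n2 / m.

    n1 and n2 are the sizes of the two reports and m is the number
    of items seen in both. Assumes independent reports and equal
    catchability (illustrative assumptions, not claims of the paper).
    """
    a, b = set(report_a), set(report_b)
    overlap = len(a & b)
    if overlap == 0:
        raise ValueError("no overlap between reports; estimate is undefined")
    return len(a) * len(b) / overlap


def chapman(report_a, report_b):
    """Chapman's bias-corrected variant; defined even with zero overlap."""
    a, b = set(report_a), set(report_b)
    overlap = len(a & b)
    return (len(a) + 1) * (len(b) + 1) / (overlap + 1) - 1


# Hypothetical identifiers standing in for reported phishing sites.
report_a = set(range(0, 60))     # 60 items in the first report
report_b = set(range(40, 100))   # 60 items in the second, 20 shared

estimate = lincoln_petersen(report_a, report_b)
observed = len(report_a | report_b)
coverage = observed / estimate   # fraction of the estimated population seen
```

With these made-up inputs the two reports jointly observe 100 items against an estimated population of 180, so the coverage is roughly 56%.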
Generating a meaningful population estimate for phishing activity requires addressing complex relationships between phishers and phishing reports. Phishers clandestinely occupy machines and add evasive measures to phishing URLs to evade firewalls and other fraud-detection measures. Phishing reports, meanwhile, may demonstrate a preference for certain classes of phish.
We address these problems by estimating population in terms of netblocks and by clustering phishing attempts together into scams, which are phishes that demonstrate similar behavior on multiple axes. We generate population estimates using data from two different phishing reports over an 80-day period, and show that these reports capture approximately 40% of scams and 80% of CIDR /24 (256 contiguous addresses) netblocks involved in phishing.
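Counting by netblock rather than by individual address can be sketched as follows, using Python's standard `ipaddress` module to map each reported host to its enclosing /24. The helper names and the sample addresses (taken from documentation-reserved ranges) are hypothetical, and this is only an illustration of the aggregation step, not the paper's actual pipeline.

```python
import ipaddress


def to_slash24(addr):
    """Return the /24 network (256 contiguous addresses) containing addr."""
    # strict=False lets us pass a host address and mask off the low 8 bits.
    return ipaddress.ip_network(f"{addr}/24", strict=False)


def netblocks(addresses):
    """Collapse a collection of IPv4 addresses into their distinct /24s."""
    return {to_slash24(a) for a in addresses}


# Hypothetical reported hosts; two of them share a /24.
reported = ["192.0.2.1", "192.0.2.200", "198.51.100.7"]
blocks = netblocks(reported)
```

Here three reported hosts collapse to two netblocks, `192.0.2.0/24` and `198.51.100.0/24`, which would then be the units compared across reports.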