Research article
DOI: 10.1145/2382196.2382233

Populated IP addresses: classification and applications

Published: 16 October 2012

Abstract

Populated IP addresses (PIPs), IP addresses that are associated with a large number of user requests, are important for online service providers to efficiently allocate resources and to detect attacks. While some PIPs serve legitimate users, many others are heavily abused by attackers to conduct malicious activities such as scams, phishing, and malware distribution. Unfortunately, commercial proxy lists like Quova have low coverage of PIP addresses and offer little support for distinguishing good PIPs from abused ones. In this study, we propose PIPMiner, a fully automated method to extract and classify PIPs by analyzing service logs. Our method combines machine learning and time series analysis to distinguish good PIPs from abused ones with over 99.6% accuracy. When applying the derived PIP list to several applications, we can identify millions of malicious Windows Live accounts on the day of their sign-up, and detect millions of malicious Hotmail accounts well before the current detection system captures them.
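
As a rough illustration of the first step the abstract describes (extracting heavily used IP addresses from service logs), the sketch below counts requests and distinct accounts per IP and builds an hourly request series that a time series analysis could consume. It is not PIPMiner itself: the record layout `(ip, account_id, unix_timestamp)`, the function names, and both thresholds are assumptions made for this example, since the abstract does not state the features or cutoffs the authors use.

```python
from collections import defaultdict


def find_populated_ips(records, min_requests=1000, min_accounts=50):
    """Return {ip: (request_count, distinct_account_count)} for busy IPs.

    `records` is an iterable of (ip, account_id, unix_timestamp) tuples.
    This layout and both thresholds are hypothetical, chosen only for
    illustration; they are not taken from the paper.
    """
    requests = defaultdict(int)
    accounts = defaultdict(set)
    for ip, account_id, _timestamp in records:
        requests[ip] += 1
        accounts[ip].add(account_id)
    return {
        ip: (requests[ip], len(accounts[ip]))
        for ip in requests
        if requests[ip] >= min_requests and len(accounts[ip]) >= min_accounts
    }


def hourly_series(records, ip):
    """Return {hour_index: request_count} for one IP (hour_index = ts // 3600)."""
    counts = defaultdict(int)
    for record_ip, _account_id, timestamp in records:
        if record_ip == ip:
            counts[timestamp // 3600] += 1
    return dict(counts)


# Toy log using documentation-range IP addresses.
sample_log = [
    ("203.0.113.7", "alice", 0),
    ("203.0.113.7", "bob", 10),
    ("203.0.113.7", "carol", 3600),
    ("203.0.113.7", "alice", 3700),
    ("203.0.113.7", "bob", 7300),
    ("198.51.100.2", "dave", 20),
    ("198.51.100.2", "dave", 4000),
]

populated = find_populated_ips(sample_log, min_requests=4, min_accounts=3)
series = hourly_series(sample_log, "203.0.113.7")
```

Separating good PIPs from abused ones would require labeled data and a trained classifier on top of features like these; that step is beyond what the abstract specifies, so it is left out here.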


    Published In

    CCS '12: Proceedings of the 2012 ACM Conference on Computer and Communications Security
    October 2012
    1088 pages
    ISBN: 9781450316514
    DOI: 10.1145/2382196
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. IP blacklisting
    2. populated IP addresses
    3. proxy
    4. spam detection

    Qualifiers

    • Research-article

    Conference

    CCS'12
    CCS'12: the ACM Conference on Computer and Communications Security
    October 16 - 18, 2012
    Raleigh, North Carolina, USA

    Acceptance Rates

    Overall Acceptance Rate 1,261 of 6,999 submissions, 18%
