Populated IP addresses: classification and applications

Authors:

Yinglian Xie

CCS '12: Proceedings of the 2012 ACM conference on Computer and communications security

Pages 329 - 340

https://doi.org/10.1145/2382196.2382233

Published: 16 October 2012

Abstract

Populated IP addresses (PIPs), that is, IP addresses associated with a large number of user requests, are important for online service providers to efficiently allocate resources and to detect attacks. While some PIPs serve legitimate users, many others are heavily abused by attackers to conduct malicious activities such as scams, phishing, and malware distribution. Unfortunately, commercial proxy lists like Quova have low coverage of PIP addresses and offer little support for distinguishing good PIPs from abused ones. In this study, we propose PIPMiner, a fully automated method to extract and classify PIPs by analyzing service logs. Our methods combine machine learning and time series analysis to distinguish good PIPs from abused ones with over 99.6% accuracy. When applying the derived PIP list to several applications, we can identify millions of malicious Windows Live accounts on the very day they sign up, and detect millions of malicious Hotmail accounts well before the current detection system captures them.
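
To make the definition of a PIP concrete, the following is a minimal sketch of the extraction idea only: counting requests per IP address in a service log and keeping the addresses above a cutoff. It is not PIPMiner's actual procedure. The function name `extract_populated_ips`, the `(ip, account)` record layout, and the `min_requests` threshold are all assumptions made for illustration, and the classification of good versus abused PIPs is not shown.

```python
from collections import Counter


def extract_populated_ips(log_records, min_requests=1000):
    """Return {ip: request_count} for IPs at or above a request threshold.

    log_records: iterable of (ip, account) tuples (hypothetical log layout).
    min_requests: illustrative cutoff, not a value taken from the paper.
    """
    counts = Counter(ip for ip, _account in log_records)
    return {ip: n for ip, n in counts.items() if n >= min_requests}


# Toy log: one busy address (five requests) and one quiet address (one request).
records = [("198.51.100.7", f"user{i}") for i in range(5)]
records.append(("203.0.113.9", "alice"))

populated = extract_populated_ips(records, min_requests=3)
print(populated)
```

With the toy log above and a cutoff of 3, only the busy address is retained. A real deployment would presumably also track distinct accounts and request timing per address, since the abstract describes machine learning and time series analysis as the basis for classification.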


Index Terms

Populated IP addresses: classification and applications
1. Security and privacy
  1. Network security


Published In

CCS '12: Proceedings of the 2012 ACM conference on Computer and communications security

October 2012

1088 pages

ISBN:9781450316514

DOI:10.1145/2382196

General Chair: Ting Yu, North Carolina State University, USA

Program Chairs: George Danezis, Microsoft Research Cambridge, UK; Virgil Gligor, Carnegie Mellon University, USA

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGSAC: ACM Special Interest Group on Security, Audit, and Control

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

CCS'12

Sponsor:

SIGSAC

CCS'12: the ACM Conference on Computer and Communications Security

October 16 - 18, 2012

Raleigh, North Carolina, USA

Acceptance Rates

Overall Acceptance Rate 1,261 of 6,999 submissions, 18%



