DOI: 10.1145/2068816.2068820
research-article

Detecting, validating and characterizing computer infections in the wild

Published: 02 November 2011

Abstract

Although network intrusion detection systems (IDSs) have been studied for several years, their operators are still overwhelmed by a large number of false-positive alerts. In this work we study the following problem: given a large archive of intrusion alerts collected in a production network, detect, with a small number of false positives, the hosts within the network that have been infected by malware. Solving this problem is essential not only for reducing the false-positive rate of IDSs, but also for labeling traces collected in the wild with information about validated security incidents. We use a 9-month-long dataset of IDS alerts and first build a novel heuristic to detect infected hosts from the 3 million alerts we observe per day on average. Our heuristic uses a statistical measure to find hosts that exhibit a repeated multi-stage malicious footprint involving specific classes of alerts. A significant part of our work is devoted to the validation of our heuristic. We conduct a complex experiment to assess the security of suspected infected systems in a production environment using data from several independent sources, including intrusion alerts, blacklists, host scanning logs, vulnerability reports, and search engine queries. We find that the false-positive rate of our heuristic is 15% and analyze in depth the root causes of the false positives. Having validated our heuristic, we apply it to our entire trace and characterize various important properties of 9 thousand infected hosts in total. For example, we find that among the infected hosts, a small number of heavy hitters originate most outbound attacks and that future infections are more likely to occur close to already infected hosts.
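The abstract does not name the statistical measure, but the author tags on this page list "j-measure". As a rough illustration only, the sketch below computes the standard J-measure of a rule "if alert class Y is seen, then alert class X is seen" from per-window alert observations for one host. The windowing scheme, the function names, and the alert class labels ("attack", "compromise") are assumptions made for this example and are not taken from the paper.

```python
import math


def j_measure(p_y: float, p_x: float, p_x_given_y: float) -> float:
    """J-measure (in bits) of the rule 'if Y then X'.

    J = p(y) * [ p(x|y) * log2(p(x|y) / p(x))
               + (1 - p(x|y)) * log2((1 - p(x|y)) / (1 - p(x))) ]
    """
    if not 0.0 < p_x < 1.0:
        raise ValueError("p_x must lie strictly between 0 and 1")

    def term(posterior: float, prior: float) -> float:
        # By convention, 0 * log(0 / q) contributes nothing.
        if posterior == 0.0:
            return 0.0
        return posterior * math.log2(posterior / prior)

    return p_y * (term(p_x_given_y, p_x) + term(1.0 - p_x_given_y, 1.0 - p_x))


def rule_j_measure(windows, y, x) -> float:
    """Estimate the J-measure of 'y => x' from a host's alert windows.

    `windows` is a list of sets; each set holds the alert classes seen
    for the host in one time window (hypothetical representation).
    """
    n = len(windows)
    n_y = sum(1 for w in windows if y in w)
    n_x = sum(1 for w in windows if x in w)
    n_xy = sum(1 for w in windows if x in w and y in w)
    # Degenerate cases carry no information about the rule.
    if n == 0 or n_y == 0 or n_x == 0 or n_x == n:
        return 0.0
    return j_measure(n_y / n, n_x / n, n_xy / n_y)


if __name__ == "__main__":
    # Hypothetical host: the two classes always co-occur in a window.
    host_windows = [{"attack", "compromise"}, {"attack", "compromise"}, set(), set()]
    print(rule_j_measure(host_windows, "attack", "compromise"))
```

A higher value means that seeing Y tells us more about X than the base rate of X alone. A detector in this spirit would flag a host when such a rule scores above a chosen threshold repeatedly over time; that threshold is a tuning parameter which this sketch does not attempt to set.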

    Published In

    IMC '11: Proceedings of the 2011 ACM SIGCOMM conference on Internet measurement conference
    November 2011
    612 pages
    ISBN:9781450310130
    DOI:10.1145/2068816
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

    Sponsors

    In-Cooperation

    • USENIX Assoc: USENIX Assoc

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. alert correlation
    2. intrusion detection
    3. j-measure
    4. malware
    5. network security
    6. snort

    Qualifiers

    • Research-article

    Conference

    IMC '11
    IMC '11: Internet Measurement Conference
    November 2 - 4, 2011
    Berlin, Germany

    Acceptance Rates

    Overall Acceptance Rate 277 of 1,083 submissions, 26%

    Cited By

    • (2021) Introduction. Network Behavior Analysis, pages 1-6. DOI: 10.1007/978-981-16-8325-1_1. Online publication date: 16-Dec-2021
    • (2019) Employing attack graphs for intrusion detection. Proceedings of the New Security Paradigms Workshop, pages 16-30. DOI: 10.1145/3368860.3368862. Online publication date: 23-Sep-2019
    • (2017) Burstiness of Intrusion Detection Process: Empirical Evidence and a Modeling Approach. IEEE Transactions on Information Forensics and Security, 12:10, pages 2348-2359. DOI: 10.1109/TIFS.2017.2705629. Online publication date: Oct-2017
    • (2017) Empirical Analysis and Validation of Security Alerts Filtering Techniques. IEEE Transactions on Dependable and Secure Computing, pages 1-1. DOI: 10.1109/TDSC.2017.2714164. Online publication date: 2017
    • (2015) A Practical Experience on Evaluating Intrusion Prevention System Event Data as Indicators of Security Issues. Proceedings of the 2015 IEEE 34th Symposium on Reliable Distributed Systems (SRDS), pages 296-305. DOI: 10.1109/SRDS.2015.17. Online publication date: 28-Sep-2015
    • (2015) An Integrated Network Behavior and Policy Based Data Exfiltration Detection Framework. Proceedings of the Fifth International Conference on Fuzzy and Neuro Computing (FANCCO - 2015), pages 337-351. DOI: 10.1007/978-3-319-27212-2_26. Online publication date: 25-Nov-2015
    • (2015) How Dangerous Is Internet Scanning? Traffic Monitoring and Analysis, pages 158-172. DOI: 10.1007/978-3-319-17172-2_11. Online publication date: 17-Apr-2015
    • (2014) IDS Alert Correlation in the Wild With EDGe. IEEE Journal on Selected Areas in Communications, 32:10, pages 1933-1946. DOI: 10.1109/JSAC.2014.2358834. Online publication date: Oct-2014
    • (2014) An Experiment with Conceptual Clustering for the Analysis of Security Alerts. Proceedings of the 2014 IEEE International Symposium on Software Reliability Engineering Workshops, pages 335-340. DOI: 10.1109/ISSREW.2014.82. Online publication date: 3-Nov-2014
    • (2013) Understanding Network Forensics Analysis in an Operational Environment. Proceedings of the 2013 IEEE Security and Privacy Workshops, pages 111-118. DOI: 10.1109/SPW.2013.12. Online publication date: 23-May-2013
