Longitudinal analysis of a large corpus of cyber threat descriptions

  • Original Paper
  • Journal of Computer Virology and Hacking Techniques

Abstract

Online cyber threat descriptions are a rich source of information, but little research has attempted to analyze them systematically. In this paper, we process and analyze two of Symantec’s online threat description corpora. The Anti-Virus (AV) corpus contains descriptions of more than 12,400 threats detected by Symantec’s AV, and the Intrusion Prevention System (IPS) corpus contains descriptions of more than 2,700 attacks detected by Symantec’s IPS. In our analysis, we quantify how threat severity and type in the corpora evolve over time. We also assess how long Symantec takes to release signatures for newly discovered threats. Our analysis indicates that only a very small minority of threats in the AV corpus are high-severity, whereas the majority of attacks in the IPS corpus are high-severity. Moreover, we find that the prevalence of different threat types, such as worms and viruses, varies considerably over time. Finally, we find that Symantec prioritizes releasing signatures for fast-propagating threats.
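
As a rough illustration of the kind of over-time quantification described above, the following minimal sketch computes the share of each severity level per year. It is not the authors' method: the function name `severity_share_by_year`, the `(year, severity)` record layout, and the toy records are all assumptions made for illustration, and none of the values come from the Symantec corpora.

```python
from collections import Counter, defaultdict


def severity_share_by_year(records):
    """Return {year: {severity: fraction of that year's entries}}.

    `records` is an iterable of (year, severity) pairs; this layout is a
    hypothetical simplification of a threat description entry.
    """
    counts = defaultdict(Counter)
    for year, severity in records:
        counts[year][severity] += 1

    shares = {}
    for year, counter in sorted(counts.items()):
        total = sum(counter.values())
        shares[year] = {sev: n / total for sev, n in counter.items()}
    return shares


# Toy records invented for illustration; not taken from the corpora.
toy_records = [
    (2009, "low"), (2009, "low"), (2009, "high"), (2009, "low"),
    (2010, "low"), (2010, "medium"),
]

shares = severity_share_by_year(toy_records)
for year, by_severity in shares.items():
    print(year, by_severity)
```

The same grouping could be applied to threat type instead of severity by swapping the second field of each record.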

Notes

  1. We assign the type “exploit” rather than “attack” because the latter would be confusing, given that all entries in the IPS corpus are considered attacks.

  2. We only consider the period later than 2008 for generic threats. Prior to 2008, the number of generic threats is small, and thus the analysis is not meaningful.

  3. A related duration that we do not study in this paper is the time between a threat’s release in the wild and its discovery. We refer readers interested in this duration to a different study on zero-day attacks [9].

References

  1. Abu Rajab, M., Zarfoss, J., Monrose, F., Terzis, A.: A multifaceted approach to understanding the botnet phenomenon. In: Internet Measurement Conference (IMC), pp. 41–52 (2006)

  2. Allodi, L., Massacci, F.: A preliminary analysis of vulnerability scores for attacks in wild. The ekits and sym datasets. In: Workshop on Building Analysis Datasets and Gathering Experience Returns for Security (BADGERS), pp. 17–24. Raleigh, NC (2012)

  3. Alvarez, G., Petrovic, S.: A new taxonomy of web attacks suitable for efficient encoding. Comput. Secur. 22, 435–449 (2003)

  4. Anderson, R., Barton, C., Böhme, R., Clayton, R., Eeten, M.J.G.V., Levi, M., Moore, T., Savage, S.: Measuring the cost of cybercrime. In: Workshop on the Economics of Information Security (WEIS), pp. 1–31. Berlin, Germany (2012)

  5. Arbaugh, W.A., Fithen, W.L., McHugh, J.: Windows of vulnerability: a case study analysis. Computer 33(12), 52–59 (2000)

  6. Bailey, M., Oberheide, J., Andersen, J., Mao, Z.M., Jahanian, F., Nazario, J.: Automated classification and analysis of internet malware. In: International Symposium on Research in Attacks, Intrusions and Defenses (RAID), pp. 178–197 (2007)

  7. Barabási, A.L.: The origin of bursts and heavy tails in human dynamics. Lett. Nat. 435, 207–211 (2005)

  8. Bayer, U., Comparetti, P.M., Hlauschek, C., Kruegel, C., Kirda, E.: Scalable, behavior-based malware clustering. In: Network and Distributed System Security Symposium (NDSS). San Diego, CA (2009)

  9. Bilge, L., Dumitraş, T.: Before we knew it. An empirical study of zero-day attacks in the real world. In: Computer and Communication Security Conference (CCS), pp. 833–844. Raleigh, NC (2012)

  10. Bishop, M.: A taxonomy of (unix) system and network vulnerabilities. Tech. Rep. CSE-95-10, Department of Computer Science, University of California Davis (1995)

  11. Bozorgi, M., Saul, L.K., Savage, S., Voelker, G.M.: Beyond heuristics: learning to classify vulnerabilities and predict exploits. In: ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pp. 105–114 (2010)

  12. Browne, H., Arbaugh, W., McHugh, J., Fithen, W.: A trend analysis of exploitations. In: Symposium on Security and Privacy. Oakland, CA (2001)

  13. Canto, J., Dacier, M., Kirda, E., Leita, C.: Large scale malware collection: lessons learned. In: IEEE Workshop on Sharing Field Data and Experiment Measurements on Resilience of Distributed Computing Systems (SRDS) (2008)

  14. Cohen, F.: Information system attacks: a preliminary classification scheme. Comput. Secur. 16, 29–46 (1997)

  15. Dumitraş, T., Shou, D.: Toward a standard benchmark for computer security research. The worldwide intelligence network environment (wine). In: Workshop on Building Analysis Datasets and Gathering Experience Returns for Security (BADGERS), pp. 89–96. Salzburg, Austria (2011)

  16. Frei, S., May, M., Fiedler, U., Plattner, B.: Large-scale vulnerability analysis. In: SIGCOMM workshop on large-scale attack defense, pp. 131–138 (2006)

  17. Frei, S., Tellenbach, B., Plattner, B.: 0-day patch: exposing vendors (in)security performance. In: Black Hat Technical Security Conference (2008)

  18. Hansman, S., Hunt, R.: A taxonomy of network and computer attacks. Comput. Secur. 24, 31–43 (2005)

  19. Hu, X., Chiueh, T., Shin, K.G.: Large-scale malware indexing using function-call graphs. In: Computer and Communication Security Conference (CCS). Chicago, IL (2009)

  20. Kanich, C., Kreibich, C., Levchenko, K., Enright, B., Voelker, G.M., Paxson, V.: Spamalytics: an empirical analysis of spam marketing conversion. In: The Computer and Communication Security Conference (CCS), pp. 3–14. Alexandria, VA (2008)

  21. Kienzle, D.M., Elder, M.: Recent worms: a survey and trends. In: Proceedings of the ACM Workshop on Rapid Malcode (WORM), pp. 1–10. Washington, DC (2003)

  22. Kotov, V., Massacci, F.: Anatomy of exploit kits: Preliminary analysis of exploit kits as software artefacts. In: the 5th International Conference on Engineering Secure Software and Systems (ESSoS), pp. 181–196. Paris, France (2013)

  23. Leita, C., Bayer, U., Kirda, E.: Exploiting diverse observation perspectives to get insights on the malware landscape. In: IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), pp. 393–402 (2010)

  24. Lough, D.L.: A taxonomy of computer attacks with applications to wireless networks. PhD thesis, Virginia Polytechnic Institute and State University (2001)

  25. McAfee: McAfee threat report: third quarter 2012. http://www.mcafee.com/us/resources/reports/rp-quarterly-threat-q3-2012.pdf (2012). Last accessed: February 2013

  26. McAfee: McAfee threats report: third quarter 2012. http://www.mcafee.com/au/resources/reports/rp-quarterly-threat-q3-2012.pdf (2012). Last accessed: December 2012

  27. MITRE: CVE-Common Vulnerabilities and Exposures. http://cve.mitre.org/ (2012). Last accessed: October 2012

  28. Moore, D., Paxson, V., Savage, S., Shannon, C., Staniford, S., Weaver, N.: Inside the slammer worm. IEEE Secur. Priv. 1(4), 33–39 (2003)

  29. Moshchuk, A., Bragin, T., Gribble, S.D., Levy, H.M.: A crawler-based study of spyware on the web. In: Symposium on Network and Distributed System Security (NDSS). San Diego, CA (2006)

  30. OPSWAT: Market share report. http://www.opswat.com/about/media/reports/antivirus-september-2012 (2013). Last accessed: March 2013

  31. Reynaud-Plantey, D.: New threats of java viruses. J. Comput. Virol. 1(1–2), 32–43 (2005)

  32. Rieck, K., Holz, T., Willems, C., Düssel, P., Laskov, P.: Learning and classification of malware behavior. In: Conference on Detection of Intrusions and Malware and Vulnerability Assessment (DIMVA), pp. 108–125. Paris, France (2008)

  33. Shahzad, M., Shafiq, M.Z., Liu, A.X.: A large scale exploratory analysis of software vulnerability life cycles. In: International Conference on Software Engineering (ICSE), pp. 771–781 (2012)

  34. Sophos: security threat report. http://www.sophos.com/en-us/security-news-trends/reports/security-threat-report.aspx (2012). Last accessed: December 2012

  35. Symantec: internet security threat report. http://www.symantec.com/content/en/us/enterprise/other_resources/b-istr_main_report_2011_21239364.en-us.pdf (2011). Last accessed: October 2012

  36. Symantec: symantec attack signatures. http://www.symantec.com/security_response/attacksignatures/ (2012). Last accessed: October 2012

  37. Symantec: symantec threat explorer. http://www.symantec.com/security_response/landing/azlisting.jsp (2012). Last accessed: October 2012

  38. Symantec: threat severity assessment. http://www.symantec.com/security_response/severityassessment.jsp (2012). Last accessed: October 2012

  39. Symantec: Types of virus definitions available for download. http://www.symantec.com/popup.jsp?popupid=sr_help_popup (2012). Last accessed: October 2012

  40. Symantec: symantec naming conventions. http://www.symantec.com/security_response/virusnaming.jsp (2013). Last accessed: June 2013

  41. Thonnard, O., Bilge, L., O’Gorman, G., Kiernan, S., Lee, M.: Industrial espionage and targeted attacks: understanding the characteristics of an escalating threat. In: International Symposium on Research in Attacks, Intrusions and Defenses (RAID), pp. 64–85 (2012)

  42. Töyssy, S., Helenius, M.: About malicious software in smartphones. J. Comput. Virol. 2(2), 109–119 (2006)

  43. TrendMicro: Threat encyclopedia. http://about-threats.trendmicro.com/us/threatencyclopedia (2012). Last accessed: October 2012

  44. Weaver, N., Paxson, V., Staniford, S., Cunningham, R.: A taxonomy of computer worms. In: ACM Workshop on Rapid Malcode (WORM), pp. 11–18 (2003)

  45. Zhou, Y., Jiang, X.: Dissecting android malware: Characterization and evolution. In: IEEE Symposium on Security and Privacy, pp. 95–109. Oakland, CA (2012)

Acknowledgments

The authors would like to thank Matthew Elder and Tudor Dumitraş for their excellent feedback and support.

Author information

Corresponding author

Correspondence to Ghita Mezzour.

Additional information

This work is supported in part by the Defense Threat Reduction Agency (DTRA) under grant number HDTRA11010102, the Army Research Office (ARO) under grants W911NF1310154 and W911NF0910273, and the Center for Computational Analysis of Social and Organizational Systems (CASOS). The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of DTRA, ARO or the US government.

About this article

Cite this article

Mezzour, G., Carley, L.R. & Carley, K.M. Longitudinal analysis of a large corpus of cyber threat descriptions. J Comput Virol Hack Tech 12, 11–22 (2016). https://doi.org/10.1007/s11416-014-0217-8

  • DOI: https://doi.org/10.1007/s11416-014-0217-8
