Detecting unknown computer worm activity via support vector machines and active learning

Nissim, Nir; Moskovitch, Robert; Rokach, Lior; Elovici, Yuval

doi:10.1007/s10044-012-0296-4

Industrial and Commercial Application
Published: 25 September 2012

Volume 15, pages 459–475 (2012)

Pattern Analysis and Applications

Nir Nissim^1,2,
Robert Moskovitch^1,2,
Lior Rokach^1,2 &
Yuval Elovici^1,2

Abstract

To detect the presence of unknown worms, we propose a technique based on computer measurements extracted from the operating system. We designed a series of experiments to test the new technique by employing several computer configurations and background application activities. In the course of the experiments, 323 computer features were monitored. Four feature-ranking measures were used to reduce the number of features required for classification. We applied support vector machines to the resulting feature subsets. In addition, we used active learning as a selective sampling method to increase the performance of the classifier and improve its robustness in the presence of misleading instances in the data. Our results indicate a mean detection accuracy in excess of 90 %, and an accuracy above 94 % for specific unknown worms using just 20 features, while maintaining a low false-positive rate when the active learning approach is applied.
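
The pipeline described in the abstract (rank features, keep a small subset, train an SVM, then query the most informative instances) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the abstract does not name its four ranking measures, so a Fisher-style score stands in for them; the SVM is a small hand-written linear model trained by subgradient descent on the hinge loss; the query rule picks the pool instances closest to the separating hyperplane; and `make_toy_data` generates hypothetical data with 5 features rather than the 323 monitored in the study.

```python
import random

def fisher_scores(X, y):
    """Score each feature: squared gap between class means over summed class variances."""
    scores = []
    for j in range(len(X[0])):
        pos = [x[j] for x, label in zip(X, y) if label == 1]
        neg = [x[j] for x, label in zip(X, y) if label == -1]
        mean_p, mean_n = sum(pos) / len(pos), sum(neg) / len(neg)
        var_p = sum((v - mean_p) ** 2 for v in pos) / len(pos)
        var_n = sum((v - mean_n) ** 2 for v in neg) / len(neg)
        scores.append((mean_p - mean_n) ** 2 / (var_p + var_n + 1e-12))
    return scores

def top_k_features(X, y, k):
    """Indices of the k highest-scoring features (assumed ranking measure)."""
    scores = fisher_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

def decision(w, b, x):
    """Signed score of x relative to the hyperplane w.x + b = 0."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=100, seed=0):
    """Linear SVM via stochastic subgradient descent on the regularized hinge loss."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * decision(w, b, X[i]) < 1
            w = [(1 - lr * lam) * wj for wj in w]  # shrink step from the L2 penalty
            if violated:  # hinge loss is active for this instance
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def select_queries(w, b, pool, n_queries):
    """Active-learning step: indices of pool instances closest to the hyperplane."""
    ranked = sorted(range(len(pool)), key=lambda i: abs(decision(w, b, pool[i])))
    return ranked[:n_queries]

def make_toy_data(n_per_class=40, seed=1):
    """Hypothetical data: features 0 and 1 separate the classes, features 2 to 4 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (1, -1):
        for _ in range(n_per_class):
            informative = [rng.gauss(2.0 * label, 0.5) for _ in range(2)]
            noise = [rng.gauss(0.0, 1.0) for _ in range(3)]
            X.append(informative + noise)
            y.append(label)
    return X, y

X, y = make_toy_data()
selected = top_k_features(X, y, k=2)
X_small = [[x[j] for j in selected] for x in X]
w, b = train_linear_svm(X_small, y)
correct = sum((decision(w, b, x) > 0) == (label == 1) for x, label in zip(X_small, y))
print("selected features:", selected, "training accuracy:", correct / len(y))
```

In an active-learning loop, the indices returned by `select_queries` would be sent for labeling, added to the training set, and the classifier retrained; the sketch shows only a single selection step.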

Notes

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/counter/counters2_lbfc.asp.
Symantec - http://www.symantec.com.
Kaspersky - http://www.viruslist.com.
McAfee - http://vil.nai.com.

Author information

Authors and Affiliations

Department of Information Systems Engineering, Ben Gurion University of the Negev, P.O.B. 653, 84105, Beer-Sheva, Israel
Nir Nissim, Robert Moskovitch, Lior Rokach & Yuval Elovici
Deutsche Telekom Laboratories, Ben Gurion University, Beer-Sheva, Israel
Nir Nissim, Robert Moskovitch, Lior Rokach & Yuval Elovici

Corresponding author

Correspondence to Lior Rokach.

About this article

Cite this article

Nissim, N., Moskovitch, R., Rokach, L. et al. Detecting unknown computer worm activity via support vector machines and active learning. Pattern Anal Applic 15, 459–475 (2012). https://doi.org/10.1007/s10044-012-0296-4

Received: 08 December 2009
Accepted: 05 September 2012
Published: 25 September 2012
Issue Date: November 2012
DOI: https://doi.org/10.1007/s10044-012-0296-4
