Review: machine learning techniques applied to cybersecurity

Martínez Torres, Javier; Iglesias Comesaña, Carla; García-Nieto, Paulino J.

doi:10.1007/s13042-018-00906-1

Original Article
Published: 04 January 2019

Volume 10, pages 2823–2836 (2019)

International Journal of Machine Learning and Cybernetics

Javier Martínez Torres ORCID: orcid.org/0000-0001-6359-895X¹,
Carla Iglesias Comesaña² &
Paulino J. García-Nieto³

Abstract

Machine learning techniques are a set of mathematical models for solving highly non-linear problems of different kinds: prediction, classification, data association, and data conceptualization. In this work, the authors review the applications of machine learning techniques in the field of cybersecurity, first describing the different classifications of the models based on (1) their structure, network-based or not, (2) their learning process, supervised or unsupervised, and (3) their complexity. All the capabilities of machine learning techniques are considered, but the authors focus on prediction and classification, highlighting the possibilities for improving the models in order to minimize the error rates in the applications developed and available in the literature. This work presents the importance of different error criteria, such as the confusion matrix or mean absolute error in classification problems and relative error in regression problems. Furthermore, special attention is paid to the application of the models in this review. There is a wide variety of possibilities, such as applying these models to intrusion detection or to the detection and classification of attacks, to name a few. However, other important and innovative applications in the field of cybersecurity are also presented. This work should serve as a guide for new researchers and those who want to immerse themselves in the field of machine learning techniques within cybersecurity.
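The error criteria named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the article: the function names, the binary labelling (1 for an attack, 0 for benign traffic), and the toy data are all assumptions chosen for illustration, and the relative error is taken here as the mean of |true - predicted| / |true|, which is one common definition among several.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary detector."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return {key: counts[key] for key in ("tp", "fp", "fn", "tn")}


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_relative_error(y_true, y_pred):
    """Average of |t - p| / |t|; assumes no true value is zero."""
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Hypothetical classifier output: 1 = attack, 0 = benign.
    labels = [1, 0, 1, 1, 0, 0, 1, 0]
    predicted = [1, 0, 0, 1, 0, 1, 1, 0]
    print(confusion_matrix(labels, predicted))
    print(mean_absolute_error(labels, predicted))

    # Hypothetical regression output, e.g. predicted traffic volumes.
    actual = [100.0, 200.0, 400.0]
    estimate = [110.0, 180.0, 400.0]
    print(mean_relative_error(actual, estimate))
```

On 0/1 labels the mean absolute error equals the misclassification rate, while the confusion matrix additionally separates missed attacks (false negatives) from false alarms (false positives), which is usually the distinction that matters for a detector.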

References

International Telecommunication Union (2014) The world in 2014: ICT Facts and figures. Technical report
Klimburg A (ed) (2012) National cyber security framework manual. NATO CCD COE Publication
Kolter JZ, Maloof MA (2006) Learning to detect and classify malicious executables in the wild. J Mach Learn Res 7:2721–2744
MathSciNet MATH Google Scholar
Almomani A, Altaher A, Ramadass S (2012) Application of adaptive neuro-fuzzy inference system for information security. J Comput Sci 8(6):983–986
Google Scholar
Bauer JM, van Eeten MJG (2009) Cybersecurity: stakeholder incentives, externalities, and policy options. Telecommun Policy 33(10–11):706–719
Google Scholar
Vázquez C (2014) Auditing using vulnerability tools to identify today’s threats business performance. SANS Institute, Fredericksburg
Google Scholar
Parise Furfaro A (2017) Using virtual environments for the assessment of cybersecurity issues in IoT scenarios. Simul Model Pract Theory 73:43–54
Google Scholar
Hashemi Khorshidpour T (2017) Domain invariant feature extraction against evasion attack. Int J Mach Learn Cybern 9:1–12
Google Scholar
Kumar VA, Pandey KK, Punia DK (2014) Cyber security threats in the power sector: Need for a domain specific regulatory framework in India. Energy Policy 65:126–133
Google Scholar
North Atlantic Treaty Organization (NATO) (2008) Bucharest summit declaration. Issued by the Heads of State and Government participating in the meeting of the North Atlantic Council in Bucharest on 3 April 2008
Barat M, Bogdan D, P, Gavrilut DT (2013) An automatic updating perceptron-based system for malware detection. In: IEEE 2013 15th international symposium on symbolic and numeric algorithms for scientific computing, pp 303–307
Bauer JM, Van Eeten M, Chattopadhyay T, Wu Y (2008) Financial implications of network security: malware and spam. Technical report, report for the international telecommunication union (ITU), Geneva (Switzerland)
International Organization for Standardization (2012) ISO/IEC 27032:2012. Information technology—Security techniques—Guidelines for cybersecurity
Fischer EA (2005) Creating a national framework for cybersecurity: an analysis of issues and options. Technical report. Congressional Research Service
The Open Web Application Security Project (OWASP) (2018) https://www.swascan.com/owasp/
The Open Web Application Security Project (2013) OWASP Top 10—the ten most critical web application security risks. The OWASP Foundation
Microsoft Security Development Lifecycle (2018) https://www.microsoft.com/enus/securityengineering/sdl/
Vatamanu C, Gavriluţ D, Benchea R-M (2013) Building a practical and reliable classifier for malware detection. J Comput Virol Hacking Tech 9(4):205–214
Google Scholar
Gavrilut D, Benchea R, Vatamanu C (September 2012) Optimized zero false positives perceptron training for malware detection. In: IEEE 2012 14th international symposium on symbolic and numeric algorithms for scientific computing, pp 247–253
Gavrilut D, Benchea R, Vatamanu C (2012) Practical optimizations for perceptron algorithms in large malware dataset. In: IEEE 2012 14th international symposium on symbolic and numeric algorithms for scientific computing, pp 240–246
Singh K, Guntuku SC, Thakur A, Hota C (2014) Big data analytics framework for peer-to-peer botnet detection using random forests. Inf Sci 278:488–497
Google Scholar
Goseva-Popstojanova K, Anastasovski G, Dimitrijevikj A, Pantev R, Miller B (2014) Characterization and classification of malicious web traffic. Comput Secur 42:92–115
Google Scholar
Purkait S (2012) Phishing counter measures and their effectiveness: literature review. Inf Manag Comput Secur 20(5):382–420
Google Scholar
Ceesay EN (2008) Mitigating phishing attacks: a detection, response and evaluation framework. Ph.D. thesis, University of California
Nappa D, Wang X, Abu-Nimeh S, Nair S (2007) A comparison of machine learning techniques for phishing detection. In: Proceedings of the anti-phishing working groups 2nd annual eCrime researchers summit on—eCrime ’07, pp 60–69
MacQueen JB (1967) Some methods for classification and analysis of multivariate observations. In: pp 281–297
Kohonen T (1982) Self-organizating formation of topologically correct feature maps. Biol Cybern 43:59–69
MathSciNet MATH Google Scholar
Gordon AD (1992) Hierarchical classification. World Scientific Press, Singapore
Google Scholar
Albayrak S, Amasyali F (2003) Fuzzy c-means clustering on medical diagnostic systems. In: International twelfth Turkish symposium on artificial intelligence and neural networks (TAINN), pp 1–3
Bradley PS, Fayad UM (1998) Refining initial points for k-means clustering. In: Proceedings of the 15th conference on machine learning, Wisconsin, pp 91–99
Haykin S (1999) Neural netowrks. A comprehensive foundation. Prentice Hall, Upper Saddle River
MATH Google Scholar
Quinlan JR (1986) Induction on decision trees. Mach Learn 1:81–106
Google Scholar
Quinlan JR (1993) C4.5: programas for machine learning. Morgan Kaufmann, Burlington
Google Scholar
Breiman L, Friedman J (1984) Classification and regression trees. Wadsworth, Belmont
MATH Google Scholar
Cherkassky V, Mulier F (1998) Learning from data: concepts, theory and methods. Wiley, Berlin
MATH Google Scholar
Vorobeva A (2017) Influence of features discretization on accuracy of random forest classifier for web user identification. In: Conference of open innovation association, FRUCT
Miller S, Busby-Earle C (2017) Multi-perspective machine learning a classifier ensemble method for intrusion detection. In: ICMLSC ’17 proceedings of the 2017 international conference on machine learning and soft computing, pp 7–12
He S, Lee G, Han S, Whinston A (2016) How would information disclosure influence organizations’ outbound spam volume? Evidence from a field experiment. J Cybersecur 2(1):99–118
Google Scholar
Vapnik V (1982) Estimation of dependences based on empirical data. Springer, Berlin
MATH Google Scholar
Drucker H, Burges C, Kaufman L, Smola A, Vapnik V (1997) Support vector regression machines. MIT Press, Cambridge
Google Scholar
Osuna E, Freund R, Girosi F (1997) An improved training algorithm for support vector machines, In: Proceedings of the 1997 IEEE signal processing society workshop, Amelia Island, Florida, USA, pp 1–10
Joachims T (1999) Machine large-scale SVM learning practical. MIT Press, Cambridge
Google Scholar
Kyriakopoulos Ghanem A (2017) Support vector machine for network intrusion and cyber-attack detection. Sensor Signal Processing for Defence Conference (SSPD2017) 38–41
Vapnik V (1998) Statistical learning theory. Wiley, Berlin
MATH Google Scholar
MacCulloch WS, Pitts WS (1943) A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys 5:115–133
MathSciNet MATH Google Scholar
Dua S, Du X (2011) Data mining and machine learning in cybersecurity. Auerbach Publications, Taylor & Francis Group, Boca Raton, FL, USA
Battiti R (1992) First and second-order methods for learning: between steepset descent and newton method. Neural Comput 4:141–166
Google Scholar
Bishop CM (1995) Neural networks and pattern recognition. Oxford University Press, Oxford
Google Scholar
Nguyen D, Widrow B (1990) Improving the learning speed of 2-layer neural network by choosing initial values of the adaptative weights. In: International joint conference on neural networks (IJCNN). IEEE, San Diego, pp 21–26
Wang X-Z, Wang R, Xu C (2018) Discovering the relationship between generalization and uncertainty by incorporating complexity of classification. IEEE Trans Cybern 48:703–715
Google Scholar
Wang R, Wang X-Z, Kwong S, Xu C (2017) Incorporating diversity and informativeness in multiple-instance active learning. IEEE Trans Fuzzy Syst 25:1460–1475
Google Scholar
Ashfaq R, Wang X-Z, Huang J, Abbas H, He Y-L (2017) Fuzziness based semi-supervised learning approach for intrusion detection system. Inf Sci 378:484–497
Google Scholar
Wang X-Z, Xing H-J, Li Y, Hua Q, Dong CR, Pedrycz W (2017) A study on relationship between generalization abilities and fuzziness of base classifiers in ensemble learning. IEEE Trans Fuzzy Syst 23:1638–1654
Google Scholar
Lecun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444
Google Scholar
Fernandez Maimo L, Perales Gomez AL, Garcia Clemente FJ, Gil Perez M, Martinez Perez. G (2018) A self-adaptive deep learning-based system for anomaly detection in 5G networks. IEEE Access 6(6):7700–7712
Google Scholar
Abeshu A, Chilamkurti N (2018) Deep learning: the frontier for distributed attack detection in fog-to-things computing. IEEE Commun Mag 56(2):169–175
Google Scholar
Kebede TM, Djaneye-Boundjou O, Narayanan BN, Ralescu A, Kapp D (2017) Classification of malware programs using autoencoders based deep learning architecture and its application to the microsoft malware classification challenge (big 2015) dataset. Proc IEEE Natl Aerosp Electron Conf NAECON 2017:70–75
Google Scholar
Xin Y, Kong L, Liu Z, Chen Y, Li Y, Zhu H, Gao M, Hou H, Wang C (2018) Machine learning and deep learning methods for cybersecurity. IEEE Access 6:35365–35381
Google Scholar
Islam R, Abawajy J (2013) A multi-tier phishing detection and filtering approach. J Netw Comput Appl 36(1):324–335
Google Scholar
Almomani A, Gupta BB, Atawneh S, Meulenberg A, Almomani E (2013) A survey of phishing email filtering techniques. IEEE Commun Surv Tutor 15(4):2070–2090
Google Scholar
Drucker H, Wu D, Vapnik VN (1999) Support vector machines for spam categorization. IEEE Trans Neural Netw Publ IEEE Neural Netw Counc 10(5):1048–54
Google Scholar
Jagatic TN, Johnson NA, Jakobsson M, Menczer F (2007) Social phishing. Commun ACM 50(10):94–100
Google Scholar
Mohammad RM, Thabtah F, McCluskey L (2015) Tutorial and critical analysis of phishing websites methods. Comput Sci Rev 17:1–24
MathSciNet Google Scholar
Cranor LF, Lamacchia BA (1998) Spam!. Commun ACM 41(8):74–83
Google Scholar
SANS Institute. Top 15 Malicious Spyware Actions (2018) https://www.sans.org/security-resources/
Kim SC, Lee SW, Sung KJ, Kim SK (2012) Splog detection usingstructural similarity between posts and URL biasedness in posts. J Internet Technol 13(5):767–772
Google Scholar
Zhu L, Sun A, Choi B (2011) Detecting spam blogs from blog search results. Inf Process Manag 47(2):246–262
Google Scholar
Luckner M, Gad M, Sobkowiak P (2014) Stable web spam detection using features based on lexical items. Comput Secur 46:79–93
Google Scholar
Prieto VM, Álvarez M, Cacheda F (2013) SAAD, a content based web spam analyzer and detector. J Syst Softw 86(11):2906–2918
Google Scholar
Scarselli F, Tsoi AC, Hagenbuchner M, Noi LD (2013) Solving graph data issues using a layered architecture approach with applications to web spam detection. Neural Netw Off J Int Neural Netw Soc 48:78–90
Google Scholar
Martinez-Romo J, Araujo L (2013) Detecting malicious tweets in trending topics using a statistical analysis of language. Expert Syst Appl 40(8):2992–3000
Google Scholar
Stern H (2008) A survey of modern spam tools. In: 5th conference on email and anti-spam, CEAS 2008. Conference on email and anti-spam, CEAS
Guzella TS, Caminhas WM (2009) A review of machine learning approaches to spam filtering. Expert Syst Appl 36(7):10206–10222
Google Scholar
Fawcett T (2003) “In vivo” spam filtering: a challenge problem for KDD. SIGKDD Explor 5(2):140–148
Google Scholar
Sahami M, Dumais S, Heckerman D, Horvitz E (1998) A Bayesian approach to filtering junk E-mail. Tech. rep. WS-98-05
Graham P (2003) A plan for spam. http://paulgraham.com/spam.html. Accessed 26 June 2003
Wang ZJ, Liu Y, Wang ZJ (2014) E-mail filtration and classification based on variable weights of the Bayesian algorithm. Appl Mech Mater 513–517:2111–2114
Google Scholar
Dewdney N, VanEss-Dykema C, MacMillan R (2001) The form is the substance. In: Proceedings of the workshop on human language technology and knowledge management, vol 2001, Morristown, NJ, USA. Association for Computational Linguistics, pp 1–8
Almeida J, Almeida T, Yamakami A (2011) Spam filtering: how the dimensionality reduction affects the accuracy of Naive Bayes classifiers. J Internet Serv Appl 1(3):183–200
Google Scholar
Song Y, Kołcz A, Giles CL (2009) Better Naive Bayes classification for high-precision spam detection. Softw Pract Exp 39(11):1003–1024
Google Scholar
Amayri O, Bouguila N (2010) A study of spam filtering using support vector machines. Artif Intell Rev 34(1):73–108
Google Scholar
Hsu W-C, Yu T-Y (2010) E-mail spam filtering based on support vector machines with Taguchi method for parameter selection. J Converg Inf Technol 5(8):78–88
Google Scholar
Caruana G, Li M, Qi M (2011) A MapReduce based parallel SVM for large scale spam filtering. In: IEEE 2011 eighth international conference on fuzzy systems and knowledge discovery (FSKD), vol 4, pp 2659–2662
Dean J, Ghemawat S (2008) MapReduce: simplified data processing on large clusters. Commun ACM 51(1):107–113
Google Scholar
Wu C-H (2009) Behavior-based spam detection using a hybrid method of rule-based techniques and neural networks. Expert Syst Appl 36(3):4321–4330
Google Scholar
Tseng L-S, Wu C-H (2003) Detection of spam e-mails by analyzing the distributing behaviors of e-mail servers. In: Proceedings of the third international conference on hybrid intelligent systems, pp 1024–1033
Gupta A, Singhal C, Aggarwal S (2012) An improved anti spam filter based on content, low level features and noise. Lect Notes Inst Comput Sci Soc Inf Telecommun Engi LNICST 84(PART 1):563–572
Google Scholar
Li P, Yan H, Cui G, Du Y (2012) Integration of local and global features for image spam filtering. J Comput Inf Syst 8(2):779–789
Google Scholar
Biggio B, Fumera G, Pillai I, Roli F (2011) A survey and experimental evaluation of image spam filtering techniques. Pattern Recognit Lett 32(10):1436–1446
Google Scholar
Hazza ZM, Aziz NA (2015) A new efficient text detection method for image spam filtering. Int Rev Comput Softw 10(1):1–8
Google Scholar
Liu T-J, Wu C-N, Lee C-L, Chen C-W (2014) A self-adaptable image spam filtering system. J Chin Inst Eng Trans Chin Inst Eng Ser A (Chung-kuo Kung Ch’eng Hsuch K’an) 37(4):517–528
Google Scholar
Manek AS, Shamini DK, Bhat VH, Shenoy PD, Mohan MC, Venugopal KR, Patnaik LM (2014) Rep-etd: a repetitive preprocessing technique for embedded text detection from images in spam emails. In: pp 568–573
Wakade S, Liszka KJ, Chan C-C (2013) Application of learning algorithms to image spam evolution. Smart Innov Syst Technol 13:471–495
Google Scholar
Attar A, Rad RM, Atani RE (2013) A survey of image spamming and filtering techniques. Artif Intell Rev 40(1):71–105
Google Scholar
Romero C, Garcia-Valdez M, Alanis A (2010) A comparative study of blog comments spam filtering with machine learning techniques. Stud Comput Intell 312:57–72
Google Scholar
Yang W, Dong G, Wang W, Hu Y, Shen G, Yu M (2015) A novel approach for bots detection in sina microblog. J Comput Theor Nanosci 12(7):1420–1425
Google Scholar
Abu-Nimeh S, Chen T (2010) Proliferation and detection of blog spam. IEEE Secur Priv Mag 8(5):42–47
Google Scholar
Kolari P, Java A, Finin T, Oates T, Joshi A (2006) Detecting spam blogs: a machine learning approach. Proc Natl Conf Artif Intell 2:1351–1356
Google Scholar
Yoshinaka T, Ishii S, Fukuhara T, Masuda H, Nakagawa H (2010) A user-oriented splog filtering based on a machine learning. Lect Notes Comput Sci (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 6045 LNCS((M4D)):88–99
Google Scholar
Sculley D, Wachman GM (2007) Relaxed online SVMS for spam filtering. In: pp 415–422
McCord M, Chuah M (2011) Spam detection on twitter using traditional classifiers. Lect Notes Comput Sci (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 6906 LNCS:175–186
Google Scholar
Breiman L (2001) Random forests. Mach Learn 45(1):5–32
MATH Google Scholar
Soman SJ, Murugappan S (2014) Detecting malicious tweets in trending topics using clustering and classification
Chu Z, Gianvecchio S, Wang H, Jajodia S (2010) Who is tweeting on twitter: human, bot, or cyborg? In: pp 21–30
Wang AH (2010) Detecting spam bots in online social networking sites: a machine learning approach. Lect Notes Comput Sci (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 6166 LNCS:335–342
Google Scholar
Wang AH (2010) Don’t follow me—spam detection in twitter. In: pp 142–151
Santos I, Miñambres-Marcos I, Laorden C, Galán-García P, Santamaría-Ibirika A, García Bringas P (2014) Twitter content-based spam filtering. Adv Intell Syst Comput 239:449–458
Google Scholar
Zangerle E, Specht G (2014) “sorry, i was hacked” a classification of compromised twitter accounts. In: pp 587–593
Benevenuto F, Rodrigues T, Almeida V, Almeida J, Zhang C, Ross K (2008) Identifying video spammers in online social networks. In: pp 45–52
Benevenuto F, Rodrigues T, Veloso A, Almeida J, Goncalves M, Almeida V (2012) Practical detection of spammers and content promoters in online video sharing systems. IEEE Trans Syst Man Cybern Part B Cybern 42(3):688–701
Google Scholar
Indira K, Christal Joy E (2014) Prevention of spammers and promoters in video social networks using SVM-knn. Int J Eng Technol 6(5):2024–2030
Google Scholar
Stolfo SJ, Hershkop S, Bui LH, Ferster R, Wang K (2005) Anomaly detection in computer security and an application to file system accesses. Lect Notes Comput Sci (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 3488 LNAI:14–28
Google Scholar
Chen Z, Ji C (2005) Spatial-temporal modeling of malware propagation in networks. IEEE Trans Neural Netw 16(5):1291–1303
Google Scholar
Lin J (2008) On malicious software classification. In: pp 368–371
Li P, Liu L, Gao D, Reiter MK (2010) On challenges in evaluating malware clustering. Lect Notes Comput Sci (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 6307 LNCS:238–255
Google Scholar
Nakazato J, Song J, Eto M, Inoue D, Nakao K (2011) A novel malware clustering method using frequency of function call traces in parallel threads. IEICE Trans Inf Syst E94–D(11):2150–2158
Google Scholar
Shafiq MZ, Khayam SA, Farooq M (2008) Improving accuracy of immune-inspired malware detectors by using intelligent features. In: pp 119–126
Bose A, Hu X, Shin KG, Park T (2008) Behavioral detection of malware on mobile handsets. In: pp 225–238
Anderson B, Quist D, Neil J, Storlie C, Lane T (2011) Graph-based malware detection using dynamic analysis. J Comput Virol 7(4):247–258
Google Scholar
Chandramohan M, Tan HBK, Briand LC, Shar LK, Padmanabhuni BM (2013) A scalable approach for malware detection through bounded feature space behavior modeling. In: pp 312–322
Dhaya R, Poongodi M (2015) Detecting software vulnerabilities in android using static analysis. In: pp 915–918
Durand J, Atkison T (2012) Applying random projection to the classification of malicious applications using data mining algorithms. In: pp 286–291
Ismail I, Marsono MN, Nor SM (2014) Malware detection using augmented naive bayes with domain knowledge and under presence of class noise. Int J Inf Comput Secur 6(2):179–197
Google Scholar
Lu W, Rammidi G, Ghorbani AA (2011) Clustering botnet communication traffic based on n-gram feature selection. Comput Commun 34(3):502–514
Google Scholar
Markel Z, Bilzor M (2015) Building a machine learning classifier for malware detection. In: Second workshop on anti-malware testing research (WATeR). IEEE, Canterbury, UK. https://doi.org/10.1109/WATeR.2014.7015757
Merkel R, Hoppe T, Kraetzer C, Dittmann J (2010) Statistical detection of malicious pe-executables for fast offline analysis. Lect Notes Comput Sci (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 6109 LNCS:93–105
Google Scholar
Moskovitch R, Elovici Y (2008) Unknown malicious code detection—practical issues. In: pp 145–152
Ponomarev S, Durand J, Wallace N, Atkison T (2013) Evaluation of random projection for malware classification. In: pp 68–73
Reddy DKS, Pujari AK (2006) N-gram analysis for computer virus detection. J Comput Virol 2(3):231–239
Google Scholar
Santos I, Penya YK, Devesa J, Bringas PG (2009) N-grams-based file signatures for malware detection. In: Volume AIDSS, pp 317–320
Shabtai A, Moskovitch R, Elovici Y, Glezer C (2009) Detection of malicious code by applying machine learning classifiers on static features: a state-of-the-art survey. Inf Secur Tech Rep 14(1):16–29 Malware
Google Scholar
Shahzad F, Farooq M (2012) Elf-miner: using structural knowledge and data mining methods to detect new (linux) malicious executables. Knowl Inf Syst 30(3):589–612
Google Scholar
Shijo PV, Salim A (2015) Integrated static and dynamic analysis for malware detection. Procedia Comput Sci 46:804–811
Google Scholar
Siddiqui M, Wang MC, Lee J (2008) A survey of data mining techniques for malware detection using file features. In: pp 509–510
Uppal D, Sinha R, Mehra V, Jain V (2014) Malware detection and classification based on extraction of API sequences. In: pp 2337–2342
Wressnegger C, Schwenk G, Arp D, Rieck K (2013) A close look on n-grams in intrusion detection: anomaly detection vs. classification. In: pp 67–76
Yu W, Zhang H, Ge L, Hardy R (2013) On behavior-based detection of malware on android platform. In: pp 814–819
Yuxin D, Wei D, Yibin Z, Chenglong X (2014) Malicious code detection using opcode running tree representation. In: pp 616–621
Yuxin D, Xuebing Y, Di Z, Li D, Zhanchao A (2011) Feature representation and selection in malicious code detection methods based on static system calls. Comput Secur 30(6–7):514–524
Google Scholar
Zolotukhin M, Hämäläinen T (2013) Support vector machine integrated with game-theoretic approach and genetic algorithm for the detection and classification of malware. In: pp 211–216
Cova M, Kruegel C, Vigna G (2010) Detection and analysis of drive-by-download attacks and malicious javascript code. In: pp 281–290
Zhu K, Yin B (2012) Malware behavior classification approach based on naive bayes. J Converg Inf Technol 7(5):203–210
Google Scholar
Zhu K, Yin B, Mao Y, Hu Y (2014) Malware classification approach based on valid window and naive bayes. Comput Res Dev (Jisuanji Yanjiu yu Fazhan) 51(2):373–381
Google Scholar
Bat-Erdene M, Kim T, Li H, Lee H (2013) Dynamic classification of packing algorithms for inspecting executables using entropy analysis. In: pp 19–26
Ban T, Isawa R, Guo S, Inoue D, Nakao K (2013) Application of string kernel based support vector machine for malware packer identification. In: The 2013 international joint conference on neural networks (IJCNN). IEEE, Dallas, TX, USA. https://doi.org/10.1109/IJCNN.2013.6707043
Divya S, Padmavathi G (2014) A novel method for detection of internet worm malcodes using principal component analysis and multiclass support vector machine. Int J Secur Appl 8(5):391–402
Google Scholar
Komiya R, Paik I, Hisada M (2011) Classification of malicious web code by machine learning. In: pp 406–411
Nissim N, Moskovitch R, Rokach L, Elovici Y (2012) Detecting unknown computer worm activity via support vector machines and active learning. Pattern Anal Appl 15(4):459–475
MathSciNet Google Scholar
Nissim N, Moskovitch R, Rokach L, Elovici Y (2014) Novel active learning methods for enhanced pc malware detection in windows os. Expert Syst Appl 41(13):5843–5857
Google Scholar
Okane P, Sezer S, McLaughlin K, Im EG (2014) Malware detection: program run length against detection rate. IET Softw 8(1):42–51
Google Scholar
Sanjaa B, Chuluun E (2013) Malware detection using linear SVM. In: vol 2, pp 136–138
Wang P, Wang Y-S (2015) Malware behavioural detection and vaccine development by using a support vector model classifier. J Comput Syst Sci 81(6):1012–1026
Google Scholar
Zhao M, Ge F, Zhang T, Yuan Z (2011) Antimaldroid: an efficient SVM-based malware detection framework for android. Commun Comput Inf Sci 243 CCIS(PART 1):158–166
Google Scholar
Biggio B, Corona I, Nelson B, Rubinstein BIP, Maiorca D, Fumera G, Giacinto G, Roli F (2014) Security evaluation of support vector machines in adversarial environments
Firdausi I, Lim C, Erwin A, Nugroho AS (2010) Analysis of machine learning techniques used in behavior-based malware detection. In: pp 201–203
Canzanese R, Kam M, Mancoridis S (2013) Toward an automatic, online behavioral malware classification system. In: pp 111–120
Dube T, Raines R, Peterson G, Bauer K, Grimaila M, Rogers S (2012) Malware target recognition via static heuristics. Comput Secur 31(1):137–147
Google Scholar
Haddadi F, Runkel D, Nur Zincir-Heywood A, Heywood MI (2014) On botnet behaviour analysis using gp and c4.5. In: pp 1253–1260
Ye W, Cho K (2014) Hybrid p2p traffic classification with heuristic rules and machine learning. Soft Comput 18(9):1815–1827
Google Scholar
Borgolte K, Kruegel C, Vigna G (2013) Delta: automatic identification of unknown web-based infection campaigns. In: pp 109–120
Mohaisen A, Alrawi O (2015) AMAL: high-fidelity, behavior-based automated malware analysis and classification. In: Rhee KH, Yi J (eds) Information security applications, WISA 2014. Lecture notes in computer science, vol 8909. Springer, pp 107–121
Rieck K, Trinius P, Willems C, Holz T (2011) Automatic analysis of malware behavior using machine learning. J Comput Secur 19(4):639–668
Google Scholar
Menahem E, Shabtai A, Rokach L, Elovici Y (2009) Improving malware detection by applying multi-inducer ensemble. Comput Stat Data Anal 53(4):1483–1494
MathSciNet MATH Google Scholar
Shabtai A, Fledel Y, Elovici Y (2010) Automated static code analysis for classifying android applications using machine learning. In: pp 329–333
Huang C-Y, Tsai Y-T, Hsu C-H (2013) Performance evaluation on permission-based detection for android malware. Smart Innov Syst Technol 21:111–120
Google Scholar
Glodek W, Harang R (2013) Rapid permissions-based detection and analysis of mobile malware using random decision forests. In: pp 980–985
Alam MS, Vuong ST (2013) Random forest classification for detecting android malware. In: pp 663–669
Ng DV, Hwang J-IG (2015) Android malware detection using the dendritic cell algorithm. In: IEEE international conference on machine learning and cybernetics, Lanzhou, China, pp 257–262
Pehlivan U, Baltaci N, Acarturk C, Baykal N (2014) The analysis of feature selection methods and classification algorithms in permission based android malware detection. In: IEEE symposium on computational intelligence in cyber security (CICS), Orlando, FL, USA. https://doi.org/10.1109/CICYBS.2014.7013371
Barbareschi M, De Benedictis A, Mazzeo A, Vespoli A (2014) Mobile traffic analysis exploiting a cloud infrastructure and hardware accelerators. In: pp 414–41
Yu W, Zhang H, Xu G (2013) A study of malware detection on smart mobile devices. In: vol 8757
Yerima SY, Sezer S, Muttik I (2014) Android malware detection using parallel machine learning classifiers. In: pp 37–42
Feldman S, Stadther D, Wang B (2015) Manilyzer: automated android malware detection through manifest analysis. In: pp 767–77
Gates CS, Li N, Peng H, Sarma B, Qi Y, Potharaju R, Nita-Rotaru C, Molloy I (2014) Generating summary risk scores for mobile applications. IEEE Trans Dependable Secure Comput 11(3):238–251
Google Scholar
Yu L, Pan Z, Liu J, Shen Y (2013) Android malware detection technology based on improved bayesian classification. In: pp 1338–1341
Shabtai A, Kanonov U, Elovici Y, Glezer C, Weiss Y (2012) “Andromaly”: a behavioral malware detection framework for android devices. J Intell Inf Syst 38(1):161–190
Google Scholar
Sanz B, Santos I, Laorden C, Ugarte-Pedrero X, Bringas PG (2012) On the automatic categorisation of android applications. In: pp 149–153
Feizollah A, Anuar NB, Salleh R, Amalina F, Ma’arof RR, Shamshirband S (2013) A study of machine learning classifiers for anomaly-based mobile botnet detection. Malays J Comput Sci 26(4):251–265
Google Scholar
Ham H-S, Kim H-H, Kim M-S, Choi M-J (2014) Linear SVM-based android malware detection. Lect Notes Electr Eng 301:575–585
Google Scholar
Narayanan A, Chen L, Chan CK (2014) AdDetect: automated detection of android ad libraries using semantic analysis. In: IEEE ninth international conference on intelligent sensors, sensor networks and information processing (ISSNIP). IEEE, Singapore. https://doi.org/10.1109/ISSNIP.2014.6827639
Sahs J, Khan L (2012) A machine learning approach to android malware detection. In: pp 141–147
Spreitzenbarth M, Schreck T, Echtler F, Arp D, Hoffmann J (2015) Mobile-sandbox: combining static and dynamic analysis with machine-learning techniques. Int J Inf Secur 14(2):141–153
Google Scholar
Sheen S, Anitha R, Natarajan V (2015) Android based malware detection using a multifeature collaborative decision fusion approach. Neurocomputing 151(P2):905–912
Google Scholar
Allix K, Bissyandé TF, Jérome Q, Klein J, State R, Le Traon Y (2014) Empirical assessment of machine learning-based malware detectors for Android. Empir Softw Eng 21:183–211
Allix K, Bissyandé TF, Klein J, Le Traon Y (2015) Are your training datasets yet relevant? An investigation into the importance of timeline in machine learning-based malware detection. Lect Notes Comput Sci (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 8978:51–67
Fette I, Sadeh N, Tomasic A (2007) Learning to detect phishing emails. In: Proceedings of the 16th international conference on World Wide Web (WWW ’07), New York (US), ACM, pp 649–656
Zhang L, Yao T (2003) Filtering junk mail with a maximum entropy model. In: pp 446–453
Gu X, Wang H, Ni T (2013) An efficient approach to detecting phishing web. J Comput Inf Syst 9(14):5553–5560
He M, Horng S, Fan P, Khan MK, Run R, Lai J, Chen R, Sutanto A (2011) An efficient phishing webpage detector. Expert Syst Appl 38(10):12018–12027
Cao J, Dong D, Mao B, Wang T (2013) Phishing detection method based on URL features. J Southeast Univ (English Edition) 29(2):134–138
Chandrasekaran M, Narayanan K, Upadhyaya S (2006) Phishing E-mail detection based on structural properties. In: Proceedings of 9th annual NYS cyber security conference, Albany, NY, USA, pp 2–8
Ma L, Ofoghi B, Watters P, Brown S (2009) Detecting phishing emails using hybrid features. In: pp 493–497
Santhana Lakshmi V, Vijaya MS (2012) Efficient prediction of phishing websites using supervised learning algorithms. Procedia Eng 30:798–805
Akinyelu AA, Adewumi AO (2014) Classification of phishing email using random forest machine learning technique. J Appl Math 2014:1–6
Webber CG, De Fátima M, Do Prado Lima W, Hepp FS (2012) Testing phishing detection criteria and methods. Adv Intell Soft Comput 133 AISC:853–858
Del Castillo MD, Iglesias Á, Serrano JI (2007) An integrated approach to filtering phishing e-mails. Lect Notes Comput Sci (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 4739 LNCS:321–328
Xiang G, Hong J, Rose CP, Cranor L (2011) Cantina+: a feature-rich machine learning framework for detecting phishing web sites. ACM Trans Inf Syst Secur 14(2):1–28
Patil R, Dasharath DB, Dhonde KS, Chinchwade RG, Mehetre SB (2014) A hybrid model to detect phishing-sites using clustering and Bayesian approach. Int J Comput Sci Netw Secur 15:92–95
Basnet RB, Sung AH, Liu Q (2012) Feature selection for improved phishing detection. Lect Notes Comput Sci (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 7345 LNAI:252–261
Qabajeh I, Thabtah F (2014) An experimental study for assessing email classification attributes using feature selection methods. In: pp 125–132


Acknowledgements

C. Iglesias acknowledges the support of the Spanish Ministry of Education, Culture and Sport through FPU Grant number 12/02283. J. Martínez acknowledges the support of the Spanish Ministry of Education through Grant project ID TIN2016-76770-R.

Author information

Authors and Affiliations

Universidad Internacional de la Rioja, Logroño, Spain
Javier Martínez Torres
University of Vigo, Vigo, Spain
Carla Iglesias Comesaña
University of Oviedo, Oviedo, Spain
Paulino J. García-Nieto


Corresponding author

Correspondence to Javier Martínez Torres.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Martínez Torres, J., Iglesias Comesaña, C. & García-Nieto, P.J. Review: machine learning techniques applied to cybersecurity. Int. J. Mach. Learn. & Cyber. 10, 2823–2836 (2019). https://doi.org/10.1007/s13042-018-00906-1


Received: 20 October 2017
Accepted: 18 December 2018
Published: 04 January 2019
Issue Date: October 2019
DOI: https://doi.org/10.1007/s13042-018-00906-1
