Abstract
Malware is continuously evolving and becoming more sophisticated to avoid detection. Traditionally, the Windows operating system has been the most popular target for malware writers because of its dominance in the desktop operating system market. However, despite the large volume of new Windows malware samples collected daily, relatively little research focuses on Windows malware. The Windows Registry, or simply the registry, is used heavily by Windows programs, making it a good source for detecting malicious behavior. In this paper, we present RAMD, a novel approach that uses an ensemble classifier consisting of multiple one-class classifiers to detect known and, especially, unknown malware that abuses registry keys and values for malicious intent. RAMD builds a model of the registry behavior of benign programs and then uses this model to detect malware by looking for anomalous registry accesses. In detail, it constructs an initial ensemble classifier by training multiple one-class classifiers and then applies a novel swarm intelligence pruning algorithm, called memetic firefly-based ensemble classifier pruning (MFECP), to reduce the ensemble's size by selecting only a subset of one-class classifiers that are highly accurate and diverse in their outputs. To combine the outputs of the one-class classifiers in the pruned ensemble classifier, RAMD uses a specific aggregation operator, called Fibonacci-based superincreasing ordered weighted averaging (FSOWA). The results of our experiments on a dataset of benign and malware samples show that RAMD achieves a detection rate of about 98.52%, a false alarm rate of 2.19%, and an accuracy of 98.43%.
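As a rough illustration of the aggregation step, the sketch below applies a generic ordered weighted averaging (OWA) operator to the anomaly scores of several one-class classifiers. The abstract does not give the FSOWA weight definition, so the weights here are an assumption: they are taken from the even-indexed Fibonacci numbers (1, 3, 8, 21, ...), which form a superincreasing sequence, and the largest weight is assigned to the largest score. The function names, the example scores, and the 0.5 threshold are all hypothetical.

```python
from typing import List, Sequence


def superincreasing_fibonacci(n: int) -> List[int]:
    """Return the first n even-indexed Fibonacci numbers: 1, 3, 8, 21, ...

    Each term exceeds the sum of all earlier terms, so the sequence is
    superincreasing. Using it for FSOWA is an assumption of this sketch.
    """
    seq = []
    a, b = 1, 1  # F(1), F(2)
    for _ in range(n):
        seq.append(b)              # b holds F(2k)
        a, b = a + b, a + 2 * b    # advance two Fibonacci steps
    return seq


def fibonacci_owa_weights(n: int) -> List[float]:
    """Normalize the sequence to sum to 1, largest weight first (assumed order)."""
    seq = superincreasing_fibonacci(n)
    total = sum(seq)
    return [value / total for value in reversed(seq)]


def owa(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Generic OWA: sort scores in descending order, then take the weighted sum."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))


# Hypothetical anomaly scores in [0, 1] from three one-class classifiers.
scores = [0.2, 0.9, 0.6]
aggregated = owa(scores, fibonacci_owa_weights(len(scores)))
is_anomalous = aggregated > 0.5  # hypothetical decision threshold
```

Because the weights attach to rank positions rather than to particular classifiers, the operator in this sketch emphasizes whichever classifiers report the strongest scores for a given sample; the actual FSOWA weighting in RAMD may differ.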

Cite this article
Tajoddin, A., Abadi, M. RAMD: registry-based anomaly malware detection using one-class ensemble classifiers. Appl Intell 49, 2641–2658 (2019). https://doi.org/10.1007/s10489-018-01405-0
DOI: https://doi.org/10.1007/s10489-018-01405-0