Abstract
Detecting malware on smartphones has become a challenging problem for researchers. In this paper, we apply five distinct machine learning algorithms and three different ensemble methods to build models that detect malware on Android-based smartphones. We propose a framework that selects the right feature sets with the aim of improving the performance of malware detection models. The framework is validated on real-world apps using two performance parameters, accuracy and F-measure, as benchmarks. We perform an empirical study on thirty different categories of Android apps. The experimental data set consists of 194,659 benign apps and 67,538 malware apps collected from different repositories. Empirical results reveal that models built with the proposed feature selection framework detect more malware-infected apps than models built with all extracted feature sets. Moreover, the malware detection model built using the nonlinear ensemble decision tree forest (NDTF) approach achieves a detection rate of 98.8%. In addition, the proposed framework is more effective at detecting malware-infected apps than different anti-virus scanners and other frameworks or approaches proposed in the literature.
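To illustrate why both accuracy and F-measure are reported, the following minimal sketch computes the two metrics from confusion-matrix counts. It uses only the benign and malware app counts stated in the abstract. The function names and the "label everything benign" baseline are assumptions made for illustration, not part of the paper's framework.

```python
# Illustrative sketch only: helper names and the trivial baseline are
# assumptions for this example, not the authors' implementation.

def accuracy(tp, fp, tn, fn):
    """Fraction of all apps that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall for the malware class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Data set sizes as stated in the abstract.
BENIGN = 194_659
MALWARE = 67_538

# Hypothetical baseline: a classifier that labels every app as benign.
# Every benign app is a true negative, every malware app a false negative.
baseline_acc = accuracy(tp=0, fp=0, tn=BENIGN, fn=MALWARE)
baseline_f = f_measure(tp=0, fp=0, fn=MALWARE)

print(f"baseline accuracy:  {baseline_acc:.4f}")
print(f"baseline F-measure: {baseline_f:.4f}")
```

Because benign apps outnumber malware apps in this data set, such a baseline scores a fairly high accuracy while detecting no malware at all, and its F-measure for the malware class is zero. Reporting both metrics guards against that kind of misleading result.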










Cite this article
Mahindru, A., Sangal, A.L. HybriDroid: an empirical analysis on effective malware detection model developed using ensemble methods. J Supercomput 77, 8209–8251 (2021). https://doi.org/10.1007/s11227-020-03569-4
DOI: https://doi.org/10.1007/s11227-020-03569-4