Abstract
The Android operating system is one of the most popular mobile operating systems. Malware intrusions increase at the same pace as the production of legitimate software. The propagation of new and transformed malware within seconds is a critical challenge for malware detection. Android applications expose thousands of features that can help identify malware. This paper proposes a novel method based on the random forest algorithm combined with three different feature selection techniques. It assesses the effect of applying three types of feature selection: effective, high weight, and effective group feature selection. Experiments conducted on the Drebin dataset indicate that applying the feature selection methods improves detection performance in terms of both the evaluation metrics and the required time. In addition, a comparison between the candidate feature selection model and a variety of baseline algorithms shows the merit of applying feature selection to Random Forest, which outperforms the other models on several metrics.
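
The abstract does not state how feature weights are computed. As a rough illustration only, the sketch below assumes binary features (present or absent in an app) and uses information gain as a stand-in weight to keep the highest-scoring features. This is not the authors' method; the function names, feature names, and toy data are hypothetical. In a full pipeline the selected features would then be passed to a random forest classifier.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(column, labels):
    """Reduction in label entropy after splitting on one binary feature."""
    gain = entropy(labels)
    for value in (0, 1):
        subset = [y for x, y in zip(column, labels) if x == value]
        gain -= (len(subset) / len(labels)) * entropy(subset)
    return gain


def select_top_k(rows, labels, feature_names, k):
    """Return the k feature names with the highest information gain, plus all scores."""
    scores = {
        name: information_gain([row[i] for row in rows], labels)
        for i, name in enumerate(feature_names)
    }
    ranked = sorted(feature_names, key=lambda name: scores[name], reverse=True)
    return ranked[:k], scores


# Toy data (invented for illustration): 1 = feature present, label 1 = malware.
FEATURES = ["SEND_SMS", "INTERNET", "READ_CONTACTS"]
ROWS = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 1],
    [0, 1, 0],
]
LABELS = [1, 1, 0, 0]

selected, scores = select_top_k(ROWS, LABELS, FEATURES, k=1)
print(selected)
```

In this toy set only one feature separates the two classes, so it receives the full one bit of gain while the other two receive none. Real feature sets would need a tie-breaking rule and a principled choice of k.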

Funding
This research received no grant from any funding agency in the public, commercial, or not-for-profit sectors.
Ethics declarations
Conflict of interest
There is no conflict of interest between the authors regarding the publication of this article.
About this article
Cite this article
Keyvanpour, M.R., Barani Shirzad, M. & Heydarian, F. Android malware detection applying feature selection techniques and machine learning. Multimed Tools Appl 82, 9517–9531 (2023). https://doi.org/10.1007/s11042-022-13767-2
DOI: https://doi.org/10.1007/s11042-022-13767-2