
Android malware detection applying feature selection techniques and machine learning

Multimedia Tools and Applications

Abstract

The Android operating system is one of the most popular mobile operating systems. Malware intrusions increase at the same pace as the production of application software, and the propagation of new and transformed malware within seconds is a critical challenge for malware detection. Android applications expose thousands of features that can help identify malicious applications. This paper proposes a method based on the random forest algorithm combined with three different feature selection techniques, and assesses the effect of each: effective, high-weight, and effective group feature selection. Experiments conducted on the Drebin dataset indicate that applying the feature selection methods improves the results in terms of both the evaluation metrics and the required time. In addition, a comparison between the candidate feature selection model and a variety of baseline algorithms shows the merit of applying feature selection to Random Forest, which outperforms the other models on several metrics.
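The general idea of reducing a large binary feature space before classification can be illustrated with a minimal sketch. It does not reproduce the paper's effective, high-weight, or effective group selection, whose definitions are not given in the abstract. As a stand-in, it uses a simple filter that ranks each binary feature by the absolute difference between its frequency in malware and in benign samples. The function names `rank_features` and `select_top_k`, the toy data, and the scoring rule are all assumptions made for illustration.

```python
from typing import List, Sequence


def rank_features(X: Sequence[Sequence[int]], y: Sequence[int]) -> List[int]:
    """Rank binary feature indices, most discriminative first.

    Assumed score (not the paper's): |P(f=1 | malware) - P(f=1 | benign)|.
    Labels: 1 = malware, 0 = benign.
    """
    n_features = len(X[0])
    n_mal = sum(1 for label in y if label == 1)
    n_ben = len(y) - n_mal
    if n_mal == 0 or n_ben == 0:
        raise ValueError("both classes must be present")

    scores = []
    for j in range(n_features):
        mal = sum(row[j] for row, label in zip(X, y) if label == 1)
        ben = sum(row[j] for row, label in zip(X, y) if label == 0)
        scores.append(abs(mal / n_mal - ben / n_ben))

    # sorted() is stable, so equal scores keep their original index order
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def select_top_k(X: Sequence[Sequence[int]], ranked: Sequence[int], k: int) -> List[List[int]]:
    """Keep only the k best-ranked columns, preserving original column order."""
    keep = sorted(ranked[:k])
    return [[row[j] for j in keep] for row in X]


# Hypothetical toy data: 4 malware rows followed by 4 benign rows, 4 binary features
X = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

ranked = rank_features(X, y)
X_reduced = select_top_k(X, ranked, k=2)
```

The reduced matrix `X_reduced` could then be passed to any random forest implementation in place of the full feature matrix. A smaller input is one plausible reason for the reduced training time the abstract reports.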



Funding

This research received no grant from any funding agency in the public, commercial, or not-for-profit sectors.

Author information

Corresponding author

Correspondence to Mohammad Reza Keyvanpour.

Ethics declarations

Conflict of interest

There is no conflict of interest between the authors regarding the publication of this article.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Keyvanpour, M.R., Barani Shirzad, M. & Heydarian, F. Android malware detection applying feature selection techniques and machine learning. Multimed Tools Appl 82, 9517–9531 (2023). https://doi.org/10.1007/s11042-022-13767-2

  • DOI: https://doi.org/10.1007/s11042-022-13767-2
