Anomaly Android Malware Detection: A Comparative Analysis of Six Classifiers

  • Conference paper
Information and Communication Technology and Applications (ICTA 2020)

Abstract

The rapid proliferation of Android devices has exposed the platform to a growing range of malware attacks. Emerging malware threats employ highly sophisticated and dynamic detection-avoidance techniques, which continue to weaken the capacity of existing signature-based detection systems to protect against new and unknown threats. Effective detection of unknown and novel Android malware therefore remains a growing challenge in mobile and information security. This study aimed to identify the best-performing machine learning classification algorithm for anomaly-based Android malware detection using permission-based feature sets, by conducting a performance comparison of six classification algorithms: Naïve Bayes, Simple Logistics, Random Forest, PART, k-Nearest Neighbours (k-NN), and Support Vector Machine (SVM). The WEKA 3.8.2 suite was used both for pre-processing the feature sets and for classification. Random Forest produced the best detection result, with a false alarm rate of 2.2%, accuracy of 97.4%, error rate of 2.6%, and ROC area of 99.6%. The study concluded that, using Android permission features, Random Forest and k-Nearest Neighbours recorded the best performance in Android malware detection, followed by the Support Vector Machine and Simple Logistics classification algorithms. Partial Decision Tree (PART) performed relatively well, while Naïve Bayes recorded the lowest performance. Consequently, Random Forest and k-NN models are recommended for the development of an anomaly-based Android malware detection paradigm.
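The metrics quoted in the abstract are related by simple arithmetic on a binary confusion matrix. The Python sketch below is illustrative only: it assumes malware is the positive class and that "false alarm rate" means the false positive rate, FP / (FP + TN). The `ConfusionMatrix` class and the counts are hypothetical, chosen so that the arithmetic lands on the reported Random Forest figures; they are not the study's actual confusion matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix; malware is assumed to be the positive class."""

    tp: int  # malware apps flagged as malware
    fp: int  # benign apps flagged as malware (false alarms)
    tn: int  # benign apps passed as benign
    fn: int  # malware apps that were missed

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.tn + self.fn

    def accuracy(self) -> float:
        # Share of all apps classified correctly.
        return (self.tp + self.tn) / self.total

    def error_rate(self) -> float:
        # Share of all apps classified incorrectly; equals 1 - accuracy.
        return (self.fp + self.fn) / self.total

    def false_alarm_rate(self) -> float:
        # Share of benign apps wrongly flagged (assumed definition).
        return self.fp / (self.fp + self.tn)


# Hypothetical counts for 500 malware and 500 benign apps (NOT the study's data).
example = ConfusionMatrix(tp=485, fp=11, tn=489, fn=15)

print(f"accuracy         = {example.accuracy():.1%}")
print(f"error rate       = {example.error_rate():.1%}")
print(f"false alarm rate = {example.false_alarm_rate():.1%}")
```

With these invented counts the script prints 97.4%, 2.6%, and 2.2%, which shows why the reported accuracy and error rate sum to 100%. The ROC area cannot be derived from a single confusion matrix, since it depends on the classifier's scores across all decision thresholds.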

Author information

Corresponding author

Correspondence to Benjamin A. Gyunka.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Gyunka, B.A., Abikoye, O.C., Adekunle, A.S. (2021). Anomaly Android Malware Detection: A Comparative Analysis of Six Classifiers. In: Misra, S., Muhammad-Bello, B. (eds) Information and Communication Technology and Applications. ICTA 2020. Communications in Computer and Information Science, vol 1350. Springer, Cham. https://doi.org/10.1007/978-3-030-69143-1_12

  • DOI: https://doi.org/10.1007/978-3-030-69143-1_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-69142-4

  • Online ISBN: 978-3-030-69143-1

  • eBook Packages: Computer Science (R0)
