Abstract
The rapid proliferation of Android devices has exposed the platform to a growing volume of malware attacks. Emerging malware threats employ highly sophisticated and dynamic detection-avoidance techniques, which continue to weaken the capacity of existing signature-based detection systems to protect against new and unknown threats. The need for effective approaches to detecting unknown and novel Android malware therefore remains a growing challenge in mobile and information security. This study aimed to identify the best-performing machine learning classification algorithm for anomaly-based Android malware detection using permission-based feature sets, by conducting a performance comparison of six classification algorithms: Naïve Bayes, Simple Logistic, Random Forest, PART, k-Nearest Neighbours (k-NN), and Support Vector Machine (SVM). The machine learning tool used for pre-processing the feature sets and for the classification processes was the WEKA 3.8.2 suite. The findings showed that Random Forest had the best detection result, with a false alarm rate of 2.2%, accuracy of 97.4%, error rate of 2.6%, and ROC Area of 99.6%. The study concluded that, using Android permission features, Random Forest and k-Nearest Neighbours recorded the best performance in Android malware detection, followed by the Support Vector Machine and Simple Logistic classification algorithms. Partial Decision Tree (PART) performed relatively well, while Naïve Bayes recorded the lowest performance. Consequently, the Random Forest and k-NN models are recommended for the development of an anomaly-based Android malware detection paradigm.
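The reported false alarm rate, accuracy, and error rate are all derived from a binary confusion matrix (malware as the positive class). The following is a minimal sketch of that arithmetic, not the study's WEKA workflow. The function name `detection_metrics` and the four counts are hypothetical: the counts were chosen only so that the arithmetic lands on the rates quoted above, and they are not the study's dataset sizes or results. ROC Area is not computed here, since it requires ranked classifier scores rather than counts.

```python
def detection_metrics(tp, fp, tn, fn):
    """Derive common detection metrics from confusion-matrix counts.

    tp: malware correctly flagged      fp: benign apps wrongly flagged
    tn: benign apps correctly passed   fn: malware missed
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        # Error rate is the complement of accuracy.
        "error_rate": 1.0 - accuracy,
        # False alarm rate: share of benign apps flagged as malware.
        "false_alarm_rate": fp / (fp + tn),
        # Detection rate (recall): share of malware that was caught.
        "detection_rate": tp / (tp + fn),
    }


# Hypothetical counts for illustration only (not taken from the paper).
metrics = detection_metrics(tp=485, fp=11, tn=489, fn=15)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Accuracy and error rate always sum to 100%, which is consistent with the 97.4% and 2.6% figures reported for Random Forest.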
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Gyunka, B.A., Abikoye, O.C., Adekunle, A.S. (2021). Anomaly Android Malware Detection: A Comparative Analysis of Six Classifiers. In: Misra, S., Muhammad-Bello, B. (eds) Information and Communication Technology and Applications. ICTA 2020. Communications in Computer and Information Science, vol 1350. Springer, Cham. https://doi.org/10.1007/978-3-030-69143-1_12
DOI: https://doi.org/10.1007/978-3-030-69143-1_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-69142-4
Online ISBN: 978-3-030-69143-1
eBook Packages: Computer Science, Computer Science (R0)