Abstract
Today, the anti-malware community faces challenges due to the ever-increasing sophistication and volume of malware attacks developed by adversaries. Traditional malware detection mechanisms cannot cope with next-generation malware attacks. Therefore, in this paper, we propose effective and efficient Android malware detection models based on machine learning and deep learning integrated with clustering. We performed a comprehensive study of different feature reduction, classification, and clustering algorithms over various performance metrics to construct the Android malware detection models. Our experimental results show that malware detection models developed using Random Forest outperformed the deep neural network and other classifiers on the majority of performance metrics. The baseline Random Forest model without any feature reduction achieved the highest AUC of \(99.4\%\). Segregating the vector space using clustering integrated with Random Forest further boosted the AUC to \(99.6\%\) in one cluster and enabled direct detection of Android malware in another cluster, thus reducing the curse of dimensionality. Additionally, we found that feature reduction improves model efficiency (training and testing time) manyfold without much penalty on the effectiveness of the detection model.
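The cluster-then-classify idea summarised in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the centroids are assumed to come from an earlier clustering step, the toy feature vectors and labels are invented for illustration, and a 1-nearest-neighbour learner (`train_1nn`, a hypothetical helper) stands in for Random Forest so that the sketch needs only the Python standard library. A cluster whose training samples are all malware is labelled directly; a mixed cluster gets its own classifier.

```python
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest_centroid(x, centroids):
    """Index of the centroid closest to x."""
    return min(range(len(centroids)), key=lambda k: sq_dist(x, centroids[k]))


def train_1nn(xs, ys):
    """Placeholder learner (1-nearest-neighbour); stands in for Random Forest."""
    def model(x):
        i = min(range(len(xs)), key=lambda j: sq_dist(x, xs[j]))
        return ys[i]
    return model


def fit_router(X, y, centroids, train_classifier):
    """For each cluster, store a constant label if the cluster is pure,
    otherwise a classifier trained only on that cluster's samples."""
    groups = defaultdict(list)
    for x, label in zip(X, y):
        groups[nearest_centroid(x, centroids)].append((x, label))
    rules = {}
    for k, members in groups.items():
        labels = {label for _, label in members}
        if len(labels) == 1:
            rules[k] = ("direct", labels.pop())
        else:
            xs = [x for x, _ in members]
            ys = [label for _, label in members]
            rules[k] = ("model", train_classifier(xs, ys))
    return rules


def predict(x, centroids, rules):
    """Route x to its cluster, then apply that cluster's rule.
    Assumes every cluster received at least one training sample."""
    kind, rule = rules[nearest_centroid(x, centroids)]
    return rule if kind == "direct" else rule(x)


# Toy data (invented): binary feature flags, label 1 = malware, 0 = benign.
X = [[1, 1, 1, 0], [1, 1, 1, 1],
     [0, 0, 1, 0], [0, 0, 0, 0], [0, 1, 0, 1], [0, 0, 0, 1]]
y = [1, 1, 0, 0, 1, 0]
# Centroids assumed to be the output of a prior clustering step.
centroids = [[1, 1, 1, 0.5], [0, 0.25, 0.25, 0.5]]

rules = fit_router(X, y, centroids, train_1nn)
print(rules[0])                                 # rule stored for cluster 0
print(predict([1, 1, 0, 0], centroids, rules))  # sample routed to cluster 0
print(predict([0, 0, 1, 1], centroids, rules))  # sample routed to cluster 1
```

Because each per-cluster classifier sees only its own region of the vector space, it is trained on a smaller and more homogeneous subset, while a pure cluster needs no classifier at all.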
Copyright information
© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Rathore, H., Sahay, S.K., Thukral, S., Sewak, M. (2021). Detection of Malicious Android Applications: Classical Machine Learning vs. Deep Neural Network Integrated with Clustering. In: Gao, H., J. Durán Barroso, R., Shanchen, P., Li, R. (eds) Broadband Communications, Networks, and Systems. BROADNETS 2020. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 355. Springer, Cham. https://doi.org/10.1007/978-3-030-68737-3_7
DOI: https://doi.org/10.1007/978-3-030-68737-3_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68736-6
Online ISBN: 978-3-030-68737-3
eBook Packages: Computer Science (R0)