Abstract
Mobile malware detection remains a significant challenge in the rapidly evolving cyber threat landscape. Although research on applying machine learning methods to this problem has produced promising results, sustaining detection performance in operational environments depends on holistically addressing two challenges: the changes in malware app features that occur over time and the high cost of data labeling. The present study explores the adaptation of active learning for inducing detection models in a non-stationary setting and shows that, when an uncertainty-based sampling strategy is applied, this approach maintains high detection performance over a long period with a minimal set of labeled data. Models induced using dynamic, static, and hybrid features of mobile malware are compared against baseline approaches. Although active learning has been adapted to many problem domains, it has not been explored extensively in mobile malware detection, especially for non-stationary settings.
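To make the uncertainty-based sampling idea concrete, the following minimal sketch ranks the unlabeled apps of one time period by how undecided a binary classifier is and picks a small labeling budget from the top. It is an illustration only: the abstract does not state which uncertainty measure the study uses, so the least-confidence score, the function names, the probability values, and the budget are all assumptions.

```python
from typing import List, Sequence


def uncertainty_scores(malware_probs: Sequence[float]) -> List[float]:
    """Least-confidence uncertainty for a binary classifier (assumed measure).

    The score is 1 - max(p, 1 - p): close to 0.0 when the model is
    confident, and 0.5 when it cannot decide between benign and malware.
    """
    return [1.0 - max(p, 1.0 - p) for p in malware_probs]


def select_for_labeling(malware_probs: Sequence[float], budget: int) -> List[int]:
    """Return the indices of the `budget` most uncertain samples."""
    scores = uncertainty_scores(malware_probs)
    # Rank sample indices from most to least uncertain.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Hypothetical predicted malware probabilities for the unlabeled apps
# observed in one time period (values invented for illustration).
period_probs = [0.98, 0.51, 0.03, 0.40, 0.75, 0.49]

# Only the selected apps would be sent for manual labeling; the model
# would then be retrained on the enlarged labeled set before the next period.
queried = select_for_labeling(period_probs, budget=2)
print(sorted(queried))
```

In this example the two apps with probabilities nearest 0.5 are selected, which is how a small labeling budget can be spent on the samples the current model is least able to classify.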
Acknowledgments
This work is partially funded by the European Union’s Horizon 2020 Research and Innovation Programme through the ECHO project (https://echonetwork.eu/) under Grant Agreement No. 830943.
Copyright information
© 2023 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Guerra-Manzanares, A., Bahsi, H. (2023). On the Application of Active Learning to Handle Data Evolution in Android Malware Detection. In: Goel, S., Gladyshev, P., Nikolay, A., Markowsky, G., Johnson, D. (eds) Digital Forensics and Cyber Crime. ICDF2C 2022. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 508. Springer, Cham. https://doi.org/10.1007/978-3-031-36574-4_15
DOI: https://doi.org/10.1007/978-3-031-36574-4_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36573-7
Online ISBN: 978-3-031-36574-4
eBook Packages: Computer Science, Computer Science (R0)