On the Application of Active Learning to Handle Data Evolution in Android Malware Detection

  • Conference paper
Digital Forensics and Cyber Crime (ICDF2C 2022)

Abstract

Mobile malware detection remains a significant challenge in the rapidly evolving cyber threat landscape. Although research on applying machine learning methods to this problem has produced promising results, sustaining detection performance in operational environments depends on holistically addressing two challenges: the variation of malware app features over time and the high cost of data labeling. The present study explores the adaptation of active learning for inducing detection models in a non-stationary setting and shows that, when an uncertainty-based sampling strategy is applied, this approach provides high detection performance over a long period with a minimal set of labeled data. Models induced using dynamic, static, and hybrid features of mobile malware are compared against baseline approaches. Although active learning has been adapted to many problem domains, it has not been extensively explored in mobile malware detection, especially for non-stationary settings.
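
To illustrate the uncertainty-based sampling strategy named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a binary classifier that outputs a malware probability per app, and uses the common least-confidence score to pick which newly observed apps to send for labeling. The function names, the `budget` parameter, and the example scores are all hypothetical.

```python
from typing import List, Sequence


def uncertainty(p_malware: float) -> float:
    """Least-confidence score for a binary classifier: 1 - max(p, 1 - p)."""
    return 1.0 - max(p_malware, 1.0 - p_malware)


def select_queries(p_malware: Sequence[float], budget: int) -> List[int]:
    """Return indices of the `budget` most uncertain samples.

    Ties keep their input order because Python's sort is stable.
    """
    ranked = sorted(range(len(p_malware)), key=lambda i: -uncertainty(p_malware[i]))
    return ranked[:budget]


# Hypothetical predicted malware probabilities for one time period of new apps.
period_scores = [0.97, 0.52, 0.10, 0.45, 0.80]

# Only the selected apps would be labeled and added to the training set
# before the model is retrained for the next period (assumed workflow).
queries = select_queries(period_scores, budget=2)
```

In a non-stationary setting, such a selection step would typically be repeated for each new time period, so that the labeling effort concentrates on the apps the current model is least sure about.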

Acknowledgments

This work is partially funded by the European Union’s Horizon 2020 Research and Innovation Programme through the ECHO project (https://echonetwork.eu/) under Grant Agreement No. 830943.

Author information

Corresponding author

Correspondence to Alejandro Guerra-Manzanares.

Copyright information

© 2023 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Guerra-Manzanares, A., Bahsi, H. (2023). On the Application of Active Learning to Handle Data Evolution in Android Malware Detection. In: Goel, S., Gladyshev, P., Nikolay, A., Markowsky, G., Johnson, D. (eds) Digital Forensics and Cyber Crime. ICDF2C 2022. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 508. Springer, Cham. https://doi.org/10.1007/978-3-031-36574-4_15

  • DOI: https://doi.org/10.1007/978-3-031-36574-4_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36573-7

  • Online ISBN: 978-3-031-36574-4

  • eBook Packages: Computer Science (R0)
