Abstract
Android malware detection is a significant problem that affects billions of users who rely on millions of Android applications (apps) in existing markets. This paper proposes PetaDroid, a framework for accurate Android malware detection and family clustering built on static analysis. PetaDroid automatically adapts to changes in Android malware and benign apps over time and is resilient to common binary obfuscation techniques. The framework employs novel techniques built on natural language processing (NLP) and machine learning to achieve accurate, adaptive, and resilient Android malware detection and family clustering. We extensively evaluated PetaDroid on multiple reference datasets. PetaDroid achieved a high detection rate (98–99% F1-score) under different evaluation settings, with high homogeneity in the produced clusters (96%). We conducted a thorough quantitative comparison with the state-of-the-art solutions MaMaDroid, DroidAPIMiner, and MalDozer; PetaDroid outperforms them under all the evaluation settings.
Notes
- 6. https://VirusShare.com.
References
Cyber attacks on Android devices on the rise (2018). https://www.gdatasoftware.com/blog/2018/11/31255-cyber-attacks-on-android-devices-on-the-rise
Mobile OS market share (2019). http://gs.statcounter.com/os-market-share/mobile/worldwide
Aafer, Y., Du, W., Yin, H.: DroidAPIMiner: mining API-level features for robust malware detection in Android. In: Zia, T., Zomaya, A., Varadharajan, V., Mao, M. (eds.) SecureComm 2013. LNICST, vol. 127, pp. 86–103. Springer, Cham (2013). https://doi.org/10.1007/978-3-319-04283-1_6
Allix, K., Bissyandé, T.F., Klein, J., Le Traon, Y.: AndroZoo: collecting millions of android apps for the research community. In: Proceedings of the 13th International Conference on Mining Software Repositories (2016)
Amira, A., Derhab, A., Karbab, E.B., Nouali, O., Khan, F.A.: Tridroid: a triage and classification framework for fast detection of mobile threats in android markets. J. Ambient Intell. Humaniz. Comput. 12, 1731–1755 (2021)
Arp, D., Spreitzenbarth, M., Hubner, M., Gascon, H., et al.: DREBIN: effective and explainable detection of Android malware in your pocket. In: Symposium Network and Distributed System Security (2014)
Bai, Y., Xing, Z., Ma, D., Li, X., Feng, Z.: Comparative analysis of feature representations and machine learning methods in android family classification. Comput. Netw. 184, 107639 (2021)
Canfora, G., Medvet, E.: Acquiring and analyzing app metrics for effective mobile malware detection. In: Proceedings of the 2016 ACM on International Workshop on Security and Privacy Analytics (2016)
Chen, X., et al.: Android HIV: a study of repackaging malware for evading machine-learning detection. IEEE Trans. Inf. Forensics Secur. 15, 987–1001 (2020)
Ding, S.H.H., Fung, B.C.M., Charland, P.: Asm2Vec: boosting static representation robustness for binary clone search against code obfuscation and compiler optimization. In: Security and Privacy (2019)
Ester, M., Kriegel, H., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. AAAI Press (1996)
Garcia, J., Hammad, M., Malek, S.: Lightweight, obfuscation-resilient detection and family identification of Android malware. ACM Trans. Softw. Eng. Methodol. 26, 1–29 (2018)
Goodfellow, I., Bengio, Y., et al.: Deep Learning. MIT Press, Cambridge (2016)
Jordaney, R., et al.: Transcend: detecting concept drift in malware classification models. In: 26th USENIX Security Symposium, USENIX Security 2017, Vancouver, BC, Canada, August 16–18, 2017 (2017)
Karbab, E.B., Debbabi, M.: ToGather: automatic investigation of android malware cyber-infrastructures. In: Proceedings of the 13th International Conference on Availability, Reliability and Security, ARES (2018)
Karbab, E.B., Debbabi, M.: Maldy: portable, data-driven malware detection using natural language processing and machine learning techniques on behavioral analysis reports. Digit. Investig. 28, S77–S87 (2019)
Karbab, E.B., Debbabi, M., Derhab, A., Mouheb, D.: Cypider: building community-based cyber-defense infrastructure for Android malware detection. In: ACM Computer Security Applications Conference (ACSAC) (2016)
Karbab, E.B., Debbabi, M., Derhab, A., Mouheb, D.: MalDozer: automatic framework for Android malware detection using deep learning. Digit. Investig. 24, S48–S59 (2018)
Karbab, E.B., Debbabi, M., Derhab, A., Mouheb, D.: Scalable and robust unsupervised android malware fingerprinting using community-based network partitioning. Comput. Secur. 97, 101965 (2020)
Karbab, E.B., Debbabi, M., Mouheb, D.: Fingerprinting Android packaging: generating DNAs for malware detection. Digit. Investig. 18, S33–S45 (2016)
Karbab, E.M.B., Debbabi, M., Alrabaee, S., Mouheb, D.: DySign: dynamic fingerprinting for the automatic detection of Android malware. In: International Conference on Malicious and Unwanted Software (2016)
Kim, J., et al.: Structural information based malicious app similarity calculation and clustering. In: Proceedings of the 2015 Conference on Research in Adaptive and Convergent Systems (2015)
Kim, Y.: Convolutional neural networks for sentence classification. CoRR (2014)
Lakshminarayanan, B., Pritzel, A., Blundell, C.: Simple and scalable predictive uncertainty estimation using deep ensembles. In: Annual Conference on Neural Information Processing Systems (2017)
Lindorfer, M., Neugschwandtner, M., et al.: Andrubis-1,000,000 apps later: a view on current Android malware behaviors. In: Building Analysis Datasets and Gathering Experience Returns for Security (BADGERS). IEEE (2014)
Maiorca, D., Ariu, D., Corona, I., Aresu, M., Giacinto, G.: Stealth attacks: an extended insight into the obfuscation effects on Android malware. Comput. Secur. 51, 16–31 (2015)
Mariconti, E., Onwuzurike, L., Andriotis, P., De Cristofaro, E., Ross, G., Stringhini, G.: MaMaDroid: detecting Android malware by building Markov chains of behavioral models. In: NDSS (2017)
Massarelli, L., Aniello, L., Ciccotelli, C., Querzoni, L., Ucci, D., Baldoni, R.: Android malware family classification based on resource consumption over time. In: 12th International Conference on Malicious and Unwanted Software, MALWARE 2017, Fajardo, PR, USA, October 11–14, 2017 (2017)
McLaughlin, N., et al.: Deep Android malware detection. In: CODASPY (2017)
Mikolov, T., Sutskever, I., et al.: Distributed representations of words and phrases and their compositionality. In: NIPS Neural Information Processing Systems (2013)
Onwuzurike, L., Mariconti, E., Andriotis, P., Cristofaro, E.D., Ross, G.J., Stringhini, G.: MaMaDroid: detecting Android malware by building Markov chains of behavioral models (extended version). ACM Trans. Priv. Secur. 22, 1–34 (2019)
Pendlebury, F., Pierazzi, F., Jordaney, R., Kinder, J., Cavallaro, L.: TESSERACT: eliminating experimental bias in malware classification across space and time. In: USENIX (2019)
Rastogi, V., Chen, Y., Jiang, X.: DroidChameleon: evaluating android anti-malware against transformation attacks. In: 8th ACM Symposium on Information, Computer and Communications Security, ASIA CCS 2013 (2013)
Rosenberg, A., Hirschberg, J.: V-measure: a conditional entropy-based external cluster evaluation measure. In: EMNLP-CoNLL (2007)
Shi, Q., et al.: Hash kernels. In: International Conference on Artificial Intelligence and Statistics (AISTATS) (2009)
Suarez-Tangil, G., et al.: DroidSieve: fast and accurate classification of obfuscated Android malware. In: Proceedings of the 7th ACM Conference on Data and Application Security and Privacy (CODASPY 2017), pp. 309–320 (2017)
Tramèr, F., Kurakin, A., Papernot, N., Goodfellow, I.J., Boneh, D., McDaniel, P.D.: Ensemble adversarial training: attacks and defenses. In: 6th International Conference on Learning Representations, ICLR 2018 (2018)
Wei, F., Li, Y., Roy, S., Ou, X., Zhou, W.: Deep ground truth analysis of current Android malware. In: Polychronakis, M., Meier, M. (eds.) DIMVA 2017. LNCS, vol. 10327, pp. 252–276. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-60876-1_12
Wu, Y., Li, X., Zou, D., Yang, W., Zhang, X., Jin, H.: MalScan: fast market-wide mobile malware scanning by social-network centrality analysis. In: 34th IEEE/ACM International Conference on Automated Software Engineering (2019)
Xu, K., Li, Y., Deng, R., Chen, K., Xu, J.: DroidEvolver: self-evolving android malware detection system. In: IEEE European Symposium on Security and Privacy (2019)
Yuan, Z., Lu, Y., Wang, Z., Xue, Y.: Droid-Sec: deep learning in android malware detection. In: ACM SIGCOMM Computer Communication Review (2014)
Zhang, X., Zhao, J.J., LeCun, Y.: Character-level convolutional networks for text classification. In: Advances in Neural Information Processing Systems (2015)
Zhang, Y., et al.: Familial clustering for weakly-labeled Android malware using hybrid representation learning. IEEE Trans. Inf. Forensics Secur. 15, 3401–3414 (2020)
Zhou, Y., Jiang, X.: Dissecting Android malware: characterization and evolution. In: IEEE Symposium on Security and Privacy (SP) (2012)
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Karbab, E.B., Debbabi, M. (2021). PetaDroid: Adaptive Android Malware Detection Using Deep Learning. In: Bilge, L., Cavallaro, L., Pellegrino, G., Neves, N. (eds) Detection of Intrusions and Malware, and Vulnerability Assessment. DIMVA 2021. Lecture Notes in Computer Science, vol 12756. Springer, Cham. https://doi.org/10.1007/978-3-030-80825-9_16
DOI: https://doi.org/10.1007/978-3-030-80825-9_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-80824-2
Online ISBN: 978-3-030-80825-9
eBook Packages: Computer Science, Computer Science (R0)