Abstract
This chapter investigates the potential of deep learning architectures for Android malware detection, specifically convolutional neural networks (CNNs) that borrow concepts from natural language processing (NLP). The proposed solution is based on static analysis of raw opcode sequences from disassembled programs, together with complementary features such as API calls and permissions. The network automatically learns the features indicative of malware while performing classification, which removes the need for hand-engineered malware features. Using the Drebin and AMD benchmark datasets, we demonstrate the benefits of a multi-view architecture that combines these multiple feature sources. We conclude that deep learning architectures enable state-of-the-art results in automatic malware detection while reducing the dependency on feature engineering and domain expertise. Compared with single-view architectures, the multi-view architecture improves performance because it is exposed to several sources of information simultaneously and so learns a more effective set of features. The model achieves state-of-the-art detection performance in a challenging zero-day scenario and reduces false positives, an important metric for potential real-world deployment, by 77% in relative terms on average.
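To make the idea of treating opcodes like words more concrete, the following is a minimal sketch in plain Python and not the chapter's actual model. It assumes a toy five-opcode vocabulary, randomly initialised (untrained) weights, and hypothetical sizes (EMBED_DIM, KERNEL_WIDTH, NUM_FILTERS). The function names opcode_view, permission_view and multi_view_features, and the shortened permission names, are likewise illustrative assumptions. One view embeds an opcode sequence, applies a 1D convolution with ReLU and pools the maximum over all positions; a second view encodes requested permissions as a binary vector; the two are concatenated into a single multi-view feature vector.

```python
import random

# Hypothetical toy vocabulary; real Dalvik bytecode has many more opcodes.
OPCODES = ["nop", "move", "const", "invoke-virtual", "return-void"]
OPCODE_TO_ID = {op: i for i, op in enumerate(OPCODES)}

EMBED_DIM = 4      # assumed embedding size
KERNEL_WIDTH = 3   # assumed number of consecutive opcodes seen by each filter
NUM_FILTERS = 2    # assumed number of convolutional filters

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Randomly initialised parameters; in a trained network these would be learned.
embedding = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in OPCODES]
filters = [
    [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(KERNEL_WIDTH)]
    for _ in range(NUM_FILTERS)
]


def opcode_view(opcodes):
    """Embed opcodes, apply 1D convolution with ReLU, then max-pool over positions.

    Assumes the sequence has at least KERNEL_WIDTH opcodes.
    """
    vectors = [embedding[OPCODE_TO_ID[op]] for op in opcodes]
    pooled = []
    for filt in filters:
        activations = []
        for start in range(len(vectors) - KERNEL_WIDTH + 1):
            window = vectors[start:start + KERNEL_WIDTH]
            total = sum(
                w * x
                for w_row, x_row in zip(filt, window)
                for w, x in zip(w_row, x_row)
            )
            activations.append(max(0.0, total))  # ReLU
        pooled.append(max(activations))  # strongest response anywhere in the sequence
    return pooled


def permission_view(requested, known_permissions):
    """Binary indicator vector over a fixed, ordered permission list."""
    return [1.0 if p in requested else 0.0 for p in known_permissions]


def multi_view_features(opcodes, requested, known_permissions):
    """Concatenate per-view features; a classifier layer would consume this vector."""
    return opcode_view(opcodes) + permission_view(requested, known_permissions)


# Hypothetical, shortened permission names used only for illustration.
KNOWN_PERMISSIONS = ["INTERNET", "SEND_SMS", "READ_CONTACTS"]
features = multi_view_features(
    ["const", "move", "invoke-virtual", "nop", "return-void"],
    {"INTERNET", "SEND_SMS"},
    KNOWN_PERMISSIONS,
)
```

In this sketch the feature vector has NUM_FILTERS entries from the opcode view followed by one entry per known permission. Because the pooling keeps only the strongest filter response, a pattern of consecutive opcodes can be picked up wherever it occurs in the program, which is the property borrowed from CNN-based sentence classification.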
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Millar, S., McLaughlin, N., Rincon, J.M.d., Miller, P. (2023). Android Malware Detection Using Deep Learning. In: Sipola, T., Kokkonen, T., Karjalainen, M. (eds) Artificial Intelligence and Cybersecurity. Springer, Cham. https://doi.org/10.1007/978-3-031-15030-2_10
DOI: https://doi.org/10.1007/978-3-031-15030-2_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15029-6
Online ISBN: 978-3-031-15030-2
eBook Packages: Computer Science (R0)