Abstract
With the widespread adoption of smartphones and the rapid growth of the mobile Internet, Android has become one of the most popular mobile platforms. However, the platform’s open-source nature has also exposed it to a surge in malware attacks. To address this issue, this paper introduces a malware detection system based on a Smali-GRU (gated recurrent unit) network, aimed at improving the efficiency of malware detection on the Android platform. The proposed system uses static analysis to extract Smali files from Android application packages (APKs). The extracted Smali files then undergo a series of preprocessing steps to extract pertinent features. To make them compatible with the GRU model, the preprocessed Smali files are split into smaller fragments. The paper tests fragments of varying sizes to identify the configuration that yields the best results. The findings show that the proposed Smali-GRU model outperforms existing works that use the same dataset and a GRU model, achieving an accuracy of 98.29%. Furthermore, the robustness of the model is evaluated on a dataset of obfuscated malware. The results indicate that the proposed model is also effective at detecting obfuscated malware in Android applications.
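The fragmentation step mentioned above can be illustrated with a minimal sketch. It assumes that a preprocessed Smali file has already been reduced to a flat list of opcode tokens. The function name `fragment_tokens`, the padding token `<pad>`, and the fragment size of 4 are hypothetical choices made for illustration and are not taken from the paper.

```python
from typing import List

PAD = "<pad>"  # hypothetical padding token, not from the paper


def fragment_tokens(tokens: List[str], size: int, pad: str = PAD) -> List[List[str]]:
    """Split a token sequence into consecutive fragments of exactly `size` tokens.

    The last fragment is right-padded so that every fragment has the same
    length, as a fixed-length recurrent input would require.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    fragments = []
    for start in range(0, len(tokens), size):
        chunk = tokens[start:start + size]
        # Pad only the final, possibly shorter, chunk.
        chunk = chunk + [pad] * (size - len(chunk))
        fragments.append(chunk)
    return fragments


# Toy opcode sequence standing in for one preprocessed Smali method.
opcodes = ["const/4", "invoke-virtual", "move-result", "if-eqz", "return-void"]
fragments = fragment_tokens(opcodes, size=4)
```

Varying `size` in such a routine corresponds to the fragment-size search described in the abstract; how the authors pad or overlap fragments is not stated there, so the right-padding used here is only one plausible option.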
Data availability
The datasets used in the current study are available at https://www.sec.tu-bs.de/~danarp/drebin and https://www.unb.ca/cic/datasets/maldroid-2020.html
Funding
No funding was received to support this research.
Contributions
All the authors contributed equally.
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest.
About this article
Cite this article
Anand, A., Singh, J.P. & Singh, A.K. Smali code-based deep learning model for Android malware detection. J Supercomput 81, 522 (2025). https://doi.org/10.1007/s11227-025-07055-7
DOI: https://doi.org/10.1007/s11227-025-07055-7