Smali code-based deep learning model for Android malware detection

The Journal of Supercomputing

Abstract

With the widespread adoption of smartphones and the exponential growth of the mobile Internet, Android has become a highly popular platform. However, the platform’s open-source nature has also exposed it to a surge in malware attacks. To address this issue, this paper introduces a malware detection system based on a Smali-GRU (gated recurrent unit) network, aimed at improving the efficiency of malware detection on Android. The system uses static analysis to extract Smali files from Android application packages (APKs). The extracted Smali files then pass through a series of preprocessing steps that extract pertinent features. To make the input compatible with the GRU model, the preprocessed Smali files are split into smaller fragments. The paper tests fragments of varying sizes to identify the configuration that yields the best results. The findings show that the proposed Smali-GRU model outperforms existing works that use the same dataset and GRU model, achieving an accuracy of 98.29%. The robustness of the model is further evaluated on a dataset of obfuscated malware, and the results indicate that the proposed model is effective at detecting obfuscated malware in Android applications.
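The fragmentation step described above can be illustrated with a minimal Python sketch. The abstract does not specify how the Smali files are tokenized or padded, so everything here is an assumption made for illustration: the function names `extract_opcodes` and `fragment`, the `<pad>` token, the choice to keep only the leading mnemonic of each instruction line, and the sample listing are all hypothetical and are not taken from the paper.

```python
from typing import List

PAD = "<pad>"  # hypothetical padding token, not taken from the paper


def extract_opcodes(smali_text: str) -> List[str]:
    """Return the leading mnemonic of each instruction line in a Smali listing.

    Simplification: payload lines (for example inside .array-data blocks)
    are not treated specially here.
    """
    opcodes = []
    for raw in smali_text.splitlines():
        line = raw.strip()
        # Skip blank lines, directives (.method, .locals),
        # labels (:cond_0) and comments (#).
        if not line or line[0] in ".:#":
            continue
        opcodes.append(line.split()[0])
    return opcodes


def fragment(tokens: List[str], size: int) -> List[List[str]]:
    """Split tokens into consecutive fragments of exactly `size` items.

    The last fragment is right-padded with PAD so every fragment has
    the same length, which is what a batched recurrent model expects.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    fragments = []
    for start in range(0, len(tokens), size):
        chunk = tokens[start:start + size]
        chunk += [PAD] * (size - len(chunk))
        fragments.append(chunk)
    return fragments


# Hypothetical Smali listing used only to exercise the functions.
SAMPLE = """\
.method public static main([Ljava/lang/String;)V
    .locals 2
    const-string v0, "hello"
    invoke-static {v0}, Lcom/example/Util;->log(Ljava/lang/String;)V
    :cond_0
    # a comment
    return-void
.end method
"""

ops = extract_opcodes(SAMPLE)
frags = fragment(ops, 2)
print(ops)
print(frags)
```

Varying the `size` argument corresponds to the fragment-size experiments mentioned in the abstract; each fragment would still need to be mapped to integer indices or embeddings before being fed to a GRU.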

Data availability

The datasets used in the current study are available at https://www.sec.tu-bs.de/~danarp/drebin and https://www.unb.ca/cic/datasets/maldroid-2020.html

Funding

No funding was received to support this research.

Author information

Contributions

All the authors contributed equally.

Corresponding author

Correspondence to Abhishek Anand.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

About this article

Cite this article

Anand, A., Singh, J.P. & Singh, A.K. Smali code-based deep learning model for Android malware detection. J Supercomput 81, 522 (2025). https://doi.org/10.1007/s11227-025-07055-7

  • DOI: https://doi.org/10.1007/s11227-025-07055-7
