A novel malware detection method based on API embedding and API parameters

The Journal of Supercomputing

Abstract

Malware has become increasingly prevalent in recent years with the widespread deployment of information systems, and many malicious programs pose a serious threat to those systems. Over the past decade, various malware detection methods have been proposed, and many of them rely on API features to identify malware. However, existing methods do not make full use of these features. To address this issue, we propose APInspector, a novel dynamic malware detection solution that carefully inspects API invocations. The method first uses a dynamic instrumentation tool to hook the target program and collect API sequence and argument features. It then applies a HAN (Hierarchical Attention Network) model to the API sequence features and an MLP (Multi-Layer Perceptron) model to the API argument features. To fully leverage both kinds of features, we propose a hybrid model that combines the HAN and MLP models. The evaluation shows that our approach detects and classifies malware effectively and outperforms each single model.
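The hybrid idea in the abstract can be illustrated with a small, untrained sketch. It is not the authors' implementation: the full architecture is not available in this preview, so every dimension, weight, API name, and helper below is a hypothetical choice made for illustration. The sequence branch is a simplified stand-in for a HAN (two levels of attention pooling, without the recurrent encoders a full HAN uses), the argument branch is a single ReLU layer standing in for the MLP, and the two outputs are concatenated before a sigmoid.

```python
import math
import random

random.seed(0)  # fixed seed so the untrained toy weights are reproducible

def rand_vector(size):
    return [random.uniform(-0.5, 0.5) for _ in range(size)]

def rand_matrix(rows, cols):
    return [rand_vector(cols) for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(vectors, proj, context):
    """Score each vector against a context vector, then take the weighted sum."""
    scores = [
        sum(c * math.tanh(h) for c, h in zip(context, matvec(proj, v)))
        for v in vectors
    ]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights

# Hypothetical sizes and vocabulary (assumptions, not taken from the paper).
EMB_DIM, ARG_DIM, HID_DIM = 4, 3, 5
VOCAB = ["CreateFileW", "WriteFile", "RegSetValueExW", "connect"]
embedding = {name: rand_vector(EMB_DIM) for name in VOCAB}

call_proj, call_ctx = rand_matrix(EMB_DIM, EMB_DIM), rand_vector(EMB_DIM)
seg_proj, seg_ctx = rand_matrix(EMB_DIM, EMB_DIM), rand_vector(EMB_DIM)
arg_weights = rand_matrix(HID_DIM, ARG_DIM)
out_weights = rand_vector(EMB_DIM + HID_DIM)

def hybrid_score(segments, arg_features):
    # Lower level: attention over the API calls inside each segment.
    seg_vecs = [
        attention_pool([embedding[call] for call in seg], call_proj, call_ctx)[0]
        for seg in segments
    ]
    # Upper level: attention over the segment vectors.
    seq_vec, seg_weights = attention_pool(seg_vecs, seg_proj, seg_ctx)
    # Argument branch: one ReLU layer over the argument feature vector.
    arg_vec = [max(0.0, h) for h in matvec(arg_weights, arg_features)]
    # Fusion: concatenate both representations and apply a sigmoid output unit.
    fused = seq_vec + arg_vec
    logit = sum(w * x for w, x in zip(out_weights, fused))
    return 1.0 / (1.0 + math.exp(-logit)), seg_weights

segments = [["CreateFileW", "WriteFile"], ["RegSetValueExW", "connect", "WriteFile"]]
arg_features = [0.2, 1.0, 0.0]  # made-up numeric argument features
score, seg_weights = hybrid_score(segments, arg_features)
print(f"malware score: {score:.3f}, segment attention: {seg_weights}")
```

Because the weights are random, the printed score has no diagnostic meaning; the sketch only shows how a sequence representation and an argument representation might be fused into one prediction.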

Data availability and materials

The supporting data and materials will be available upon request.

Notes

  1. https://www.autoitscript.com/site/.

Funding

This work was supported by Science and Technology Project of State Grid Corporation of China under Grant 5108-202218280A-2-154-XG.

Author information

Contributions

All authors contributed equally to this study.

Corresponding author

Correspondence to Donghai Tian.

Ethics declarations

Conflict of interest

The authors state that they have no known competing financial interests or personal ties that could have appeared to affect the work reported in this study.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhou, B., Huang, H., Xia, J. et al. A novel malware detection method based on API embedding and API parameters. J Supercomput 80, 2748–2766 (2024). https://doi.org/10.1007/s11227-023-05556-x

  • DOI: https://doi.org/10.1007/s11227-023-05556-x
