Abstract
Malware has become increasingly prevalent in recent years with the widespread deployment of information systems, and many malicious programs pose a serious threat to them. Over the past decade, various malware detection methods have been proposed, and many of them rely on API features to identify malware. However, existing methods do not make full use of these features. To address this issue, we propose APInspector, a novel dynamic malware detection solution that carefully inspects API invocations. The method first uses a dynamic instrumentation tool to hook the target program and collect API sequence and argument features. It then applies a HAN (Hierarchical Attention Network) model to the API sequence features and an MLP (Multi-Layer Perceptron) model to the API argument features. To exploit both kinds of features together, we propose a hybrid model that combines the HAN and MLP models. The evaluation shows that our approach detects and classifies malware effectively and outperforms either single model.
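The two-branch idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the HAN and MLP outputs are combined, so concatenating the two branch outputs before a linear classifier is an assumption. The sequence branch is also heavily simplified, using two levels of attention pooling (over API calls in a window, then over windows) without the recurrent encoders of a full HAN. All names (`HybridSketch`, `attention_pool`, `dense`), dimensions, and the random untrained weights are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Weight each vector by softmax(dot(vector, context)) and sum them."""
    scores = [sum(a * b for a, b in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


def dense(x, weights, bias, relu=True):
    """One fully connected layer, optionally followed by ReLU."""
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]
    return [max(0.0, o) for o in out] if relu else out


def init_matrix(rows, cols, rng):
    """Random, untrained weights; for illustration only."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class HybridSketch:
    """Hypothetical fusion of a sequence branch and an argument branch."""

    def __init__(self, vocab_size, embed_dim, arg_dim, hidden_dim, n_classes, seed=0):
        rng = random.Random(seed)
        self.embed = init_matrix(vocab_size, embed_dim, rng)  # one vector per API id
        self.call_ctx = [rng.uniform(-0.5, 0.5) for _ in range(embed_dim)]
        self.window_ctx = [rng.uniform(-0.5, 0.5) for _ in range(embed_dim)]
        self.w_arg = init_matrix(hidden_dim, arg_dim, rng)
        self.b_arg = [0.0] * hidden_dim
        self.w_out = init_matrix(n_classes, embed_dim + hidden_dim, rng)
        self.b_out = [0.0] * n_classes

    def sequence_branch(self, windows):
        # windows: list of windows, each a list of integer API ids.
        # Level 1: attention over the calls inside each window.
        window_vecs = [
            attention_pool([self.embed[i] for i in window], self.call_ctx)
            for window in windows
        ]
        # Level 2: attention over the window vectors.
        return attention_pool(window_vecs, self.window_ctx)

    def argument_branch(self, arg_features):
        # arg_features: fixed-length numeric vector derived from API arguments.
        return dense(arg_features, self.w_arg, self.b_arg)

    def predict_proba(self, windows, arg_features):
        # Assumed fusion step: concatenate both branch outputs, then classify.
        fused = self.sequence_branch(windows) + self.argument_branch(arg_features)
        return softmax(dense(fused, self.w_out, self.b_out, relu=False))


if __name__ == "__main__":
    model = HybridSketch(vocab_size=10, embed_dim=4, arg_dim=3, hidden_dim=5, n_classes=2)
    print(model.predict_proba([[1, 2, 3], [4, 5]], [0.1, 0.0, 1.0]))
```

In practice both branches would be trained jointly on labeled traces. The sketch only shows how features from the two sources could meet in a single classifier.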
Data availability and materials
The supporting data and materials will be available upon request.
Funding
This work was supported by the Science and Technology Project of the State Grid Corporation of China under Grant 5108-202218280A-2-154-XG.
Author information
Contributions
All authors contributed equally to this study.
Ethics declarations
Conflict of interest
The authors state that they have no known competing financial interests or personal ties that could have appeared to affect the work reported in this study.
Ethical approval
This article does not contain any studies with human participants performed by any of the authors.
About this article
Cite this article
Zhou, B., Huang, H., Xia, J. et al. A novel malware detection method based on API embedding and API parameters. J Supercomput 80, 2748–2766 (2024). https://doi.org/10.1007/s11227-023-05556-x
DOI: https://doi.org/10.1007/s11227-023-05556-x