A novel malware detection method based on API embedding and API parameters

Zhou, Bo; Huang, Hai; Xia, Jun; Tian, Donghai

doi:10.1007/s11227-023-05556-x

Published: 21 August 2023

The Journal of Supercomputing, Volume 80, pages 2748–2766 (2024)

Abstract

Malware has become increasingly prevalent in recent years with the widespread deployment of information systems, and many malicious programs pose a serious threat to these systems. Over the past decade, various malware detection methods have been proposed, and many of them rely on API features to identify malware. However, existing methods do not make full use of these features. To address this issue, we propose APInspector, a novel dynamic malware detection solution that carefully inspects API invocations. The method first uses a dynamic instrumentation tool to hook the target program and collect API sequence and argument features. It then applies a HAN (Hierarchical Attention Network) model to analyze the API sequence features and an MLP (Multi-Layer Perceptron) model to analyze the API argument features. To exploit both feature types fully, we propose a hybrid model that combines the HAN and MLP models. The evaluation shows that our approach detects and classifies malware effectively and outperforms either single model.
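The two-branch idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `HybridSketch`, all dimensions, the random untrained weights, and the grouping of API calls into fixed windows are assumptions made for illustration. The recurrent encoders of a full HAN are omitted, so only the two attention levels remain.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(vectors, context):
    """Sum the vectors, each weighted by softmax(dot(vector, context))."""
    weights = softmax([sum(v * c for v, c in zip(vec, context)) for vec in vectors])
    dim = len(vectors[0])
    return [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]

def dense(x, weights, bias, relu=False):
    """Fully connected layer: one output per weight row."""
    out = [sum(xi * wi for xi, wi in zip(x, row)) + b for row, b in zip(weights, bias)]
    return [max(0.0, o) for o in out] if relu else out

def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

class HybridSketch:
    """Hypothetical fusion of an attention branch and an MLP branch (untrained)."""

    def __init__(self, vocab_size, embed_dim=8, arg_dim=4, hidden=6, seed=0):
        rng = random.Random(seed)
        self.embedding = random_matrix(rng, vocab_size, embed_dim)  # one vector per API id
        self.call_context = [rng.uniform(-0.5, 0.5) for _ in range(embed_dim)]
        self.window_context = [rng.uniform(-0.5, 0.5) for _ in range(embed_dim)]
        self.mlp_w = random_matrix(rng, hidden, arg_dim)
        self.mlp_b = [0.0] * hidden
        self.out_w = random_matrix(rng, 1, embed_dim + hidden)
        self.out_b = [0.0]

    def sequence_branch(self, windows):
        # Lower level: attention over the API calls inside each window.
        window_vecs = [
            attention_pool([self.embedding[api_id] for api_id in window], self.call_context)
            for window in windows
        ]
        # Upper level: attention over the window vectors.
        return attention_pool(window_vecs, self.window_context)

    def argument_branch(self, arg_features):
        # Single hidden layer standing in for the MLP over argument features.
        return dense(arg_features, self.mlp_w, self.mlp_b, relu=True)

    def predict(self, windows, arg_features):
        # Concatenate both branch outputs, then apply a sigmoid output unit.
        fused = self.sequence_branch(windows) + self.argument_branch(arg_features)
        logit = dense(fused, self.out_w, self.out_b)[0]
        return 1.0 / (1.0 + math.exp(-logit))

# Usage with made-up inputs: two windows of API ids and four argument features.
model = HybridSketch(vocab_size=5)
windows = [[0, 1, 2], [3, 4, 1]]
arg_features = [0.2, 0.0, 1.0, 0.5]
score = model.predict(windows, arg_features)
```

Concatenating the branch outputs before a shared output layer is one common way to fuse features; the abstract does not say which fusion scheme the hybrid model uses.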

Data availability and materials

The supporting data and materials will be available upon request.

Notes

https://www.autoitscript.com/site/.

Funding

This work was supported by the Science and Technology Project of the State Grid Corporation of China under Grant 5108-202218280A-2-154-XG.

Author information

Authors and Affiliations

Hunan SGIT Technology Company Limited, Changsha, 410118, China
Bo Zhou & Hai Huang
State Grid Hunan Electric Power Company Limited, Changsha, 410007, China
Jun Xia
Beijing Fanwang Hulian Technology Company Limited and Beijing Institute of Technology, Beijing, 100081, China
Donghai Tian

Contributions

All authors contributed equally to this study.

Corresponding author

Correspondence to Donghai Tian.

Ethics declarations

Conflict of interest

The authors state that they have no known competing financial interests or personal ties that could have appeared to affect the work reported in this study.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhou, B., Huang, H., Xia, J. et al. A novel malware detection method based on API embedding and API parameters. J Supercomput 80, 2748–2766 (2024). https://doi.org/10.1007/s11227-023-05556-x

Accepted: 02 August 2023
Published: 21 August 2023
Issue Date: January 2024
DOI: https://doi.org/10.1007/s11227-023-05556-x
