Abstract
As malware threats grow in complexity and sophistication, advanced detection methods such as deep neural networks (DNNs) for malware classification have become increasingly vital for safeguarding digital infrastructure and protecting sensitive data. To measure progress in this safety-critical landscape, we propose two malware classification benchmarks: a feature-based benchmark and an image-based benchmark. Feature-based datasets provide a detailed view of malware characteristics, while image-based datasets transform raw malware binaries into grayscale images for swift processing. Both kinds of dataset support binary classification (benign vs. malicious) as well as classification of known malware into a particular family. The benchmarks cover binary and family classification at varying difficulty levels so that improvements in malware classification strategies can be quantified. Key contributions include the creation of the feature and image dataset benchmarks and the validation of a trained binary classification network using the feature dataset benchmark. The benchmarks and example training code are available.
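To make the image-based representation concrete, the following minimal sketch shows one common way to turn a raw byte sequence into a grayscale pixel grid: each byte becomes one pixel intensity in the range 0 to 255, and the byte stream is wrapped into rows of a fixed width. The function name `bytes_to_grayscale`, the default width of 256, and the zero padding of the last row are assumptions made for illustration; they are not taken from the benchmark's own code, which may differ.

```python
from typing import List


def bytes_to_grayscale(data: bytes, width: int = 256) -> List[List[int]]:
    """Wrap a byte sequence into rows of `width` pixel intensities (0-255).

    Hypothetical helper for illustration only; the row width and the
    zero padding are assumptions, not the benchmark's actual settings.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Ceiling division: number of rows needed to hold every byte.
    height = -(-len(data) // width)
    # Pad the final row with zero bytes (black pixels) so the grid is rectangular.
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


if __name__ == "__main__":
    # Stand-in for the contents of a binary file: 770 arbitrary bytes.
    sample = bytes(range(256)) * 3 + b"\x4d\x5a"
    image = bytes_to_grayscale(sample)
    print(len(image), len(image[0]))  # 4 rows of 256 pixels
```

The resulting grid can be handed to an image library or array package for resizing before training; that step is omitted here to keep the sketch free of third-party dependencies.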
Notes
Code available here: https://github.com/pkrobinette/malware_benchmarks.
Acknowledgement
This paper was supported in part by a fellowship award under contract FA9550-21-F-0003 through the National Defense Science and Engineering Graduate (NDSEG) Fellowship Program, sponsored by the Air Force Research Laboratory (AFRL), the Office of Naval Research (ONR), and the Army Research Office (ARO). The material presented in this paper is based upon work supported by the National Science Foundation (NSF) through grant numbers 2220426 and 2220401, the Defense Advanced Research Projects Agency (DARPA) under contract number FA8750-23-C-0518, and the Air Force Office of Scientific Research (AFOSR) under contract numbers FA9550-22-1-0019 and FA9550-23-1-0135. Any opinions, findings, and conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of AFOSR, DARPA, or NSF.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Robinette, P.K., Lopez, D.M., Johnson, T.T. (2024). Benchmark: Neural Network Malware Classification. In: Steffen, B. (eds) Bridging the Gap Between AI and Reality. AISoLA 2023. Lecture Notes in Computer Science, vol 14380. Springer, Cham. https://doi.org/10.1007/978-3-031-46002-9_17
DOI: https://doi.org/10.1007/978-3-031-46002-9_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46001-2
Online ISBN: 978-3-031-46002-9
eBook Packages: Computer Science, Computer Science (R0)