Benchmark: Neural Network Malware Classification

Robinette, Preston K.; Lopez, Diego Manzanas; Johnson, Taylor T.

doi:10.1007/978-3-031-46002-9_17

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14380)

Included in the following conference series:

International Conference on Bridging the Gap between AI and Reality

Abstract

As malware threats continue to increase in both complexity and sophistication, the adoption of advanced detection methods, such as deep neural networks (DNNs) for malware classification, has become increasingly vital to safeguard digital infrastructure and protect sensitive data. To measure progress in this safety-critical landscape, we propose two malware classification benchmarks: a feature-based benchmark and an image-based benchmark. Feature-based datasets provide a detailed understanding of malware characteristics, and image-based datasets transform raw malware binary data into grayscale images for swift processing. These datasets can be used both for binary classification (benign vs. malicious) and for classifying known malware into a particular family. This paper therefore introduces two benchmark datasets for binary and family classification, with varying difficulty levels, to quantify improvements in malware classification strategies. Key contributions include the creation of feature and image dataset benchmarks and the validation of a trained binary classification network using the feature dataset benchmark. Benchmarks as well as example training code are available.
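As an illustration of the image-based representation mentioned in the abstract, the sketch below shows one common way to turn raw bytes into a grayscale image: each byte becomes one 8-bit pixel, and the byte stream is cut into fixed-width rows. This is a minimal sketch under stated assumptions, not the benchmark's actual preprocessing. The function names `bytes_to_rows` and `to_pgm`, the row width of 64, the zero-padding of the final row, and the PGM output format are all hypothetical choices made for the example.

```python
def bytes_to_rows(data: bytes, width: int = 256) -> list[bytes]:
    """Interpret each byte as one 8-bit pixel and split into rows of `width` pixels.

    Assumption: the final row is zero-padded so that every row has equal length.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    remainder = len(data) % width
    if remainder:
        data += bytes(width - remainder)  # pad the tail with zero-valued pixels
    return [data[i : i + width] for i in range(0, len(data), width)]


def to_pgm(rows: list[bytes]) -> bytes:
    """Serialize the rows as a binary PGM (P5) grayscale image."""
    height = len(rows)
    width = len(rows[0]) if rows else 0
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + b"".join(rows)


# Synthetic stand-in for the contents of a binary file (not real malware).
sample = bytes(range(256)) * 3 + b"MZ"
rows = bytes_to_rows(sample, width=64)
pgm = to_pgm(rows)
print(len(rows), len(rows[0]))
```

Writing `pgm` to a file with a `.pgm` extension yields an image that most image viewers can open. A fixed row width keeps the sketch simple; other schemes, such as choosing the width from the file size, are equally plausible.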

Notes

1.
Code: https://github.com/pkrobinette/malware_benchmarks.

Acknowledgement

This paper was supported in part by a fellowship award under contract FA9550-21-F-0003 through the National Defense Science and Engineering Graduate (NDSEG) Fellowship Program, sponsored by the Air Force Research Laboratory (AFRL), the Office of Naval Research (ONR), and the Army Research Office (ARO). The material presented in this paper is based upon work supported by the National Science Foundation (NSF) through grant numbers 2220426 and 2220401, the Defense Advanced Research Projects Agency (DARPA) under contract number FA8750-23-C-0518, and the Air Force Office of Scientific Research (AFOSR) under contract numbers FA9550-22-1-0019 and FA9550-23-1-0135. Any opinions, findings, and conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of AFOSR, DARPA, or NSF.

Author information

Authors and Affiliations

Vanderbilt University, Nashville, TN, 37212, USA
Preston K. Robinette, Diego Manzanas Lopez & Taylor T. Johnson

Corresponding author

Correspondence to Preston K. Robinette.

Editor information

Editors and Affiliations

TU Dortmund University, Dortmund, Germany
Bernhard Steffen

About this paper

Cite this paper

Robinette, P.K., Lopez, D.M., Johnson, T.T. (2024). Benchmark: Neural Network Malware Classification. In: Steffen, B. (eds) Bridging the Gap Between AI and Reality. AISoLA 2023. Lecture Notes in Computer Science, vol 14380. Springer, Cham. https://doi.org/10.1007/978-3-031-46002-9_17

DOI: https://doi.org/10.1007/978-3-031-46002-9_17
Published: 14 December 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46001-2
Online ISBN: 978-3-031-46002-9
eBook Packages: Computer Science, Computer Science (R0)
