A Vision Transformer Enhanced with Patch Encoding for Malware Classification

Park, Kyoung-Won; Cho, Sung-Bae

doi:10.1007/978-3-031-21753-1_29

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13756)

Included in the following conference series:

International Conference on Intelligent Data Engineering and Automated Learning

Abstract

Alongside the many benefits of software technology development, malicious attacks that steal confidential and company information have been steadily increasing. Recent deep learning models that operate on images converted from malicious code achieve meaningful results, but they struggle to classify malware families such as Ramnit, Tracur, and Obfuscator.ACY, whose images have similar structures. Instead of observing the overall global features, a method is needed that considers the positions of local features and learns the relationships between them. In this paper, we propose a vision transformer enhanced with an additional encoding of multiple patches, which captures the location information of local features and the relationship information between them. To learn from the position information and all relationships between patches, [CLS] tokens that can summarize all of this information are utilized. 10-fold cross-validation on the Microsoft challenge dataset shows that the proposed model achieves better accuracy than comparable studies. The misclassification analysis confirms that the proposed method can detect malware of the same family that evades a conventional deep learning model. Additional analysis with the activation map highlights which structural and sequential features are extracted to detect different codes belonging to the same malware family.
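As a rough illustration of the input pipeline the abstract alludes to, the sketch below lays raw bytes out as a grayscale image, cuts it into patches, and builds a token sequence with a prepended [CLS] token and added position encodings. This is only the front end of a standard vision transformer, not the additional patch encoding proposed in the paper, whose details the abstract does not give. The function names, the row-wise byte layout, the image width, the patch size, the embedding dimension, and the random stand-ins for learned parameters are all assumptions made for illustration.

```python
import math
import random


def bytes_to_image(data, width):
    """Lay bytes out row by row as a grayscale image, one byte per pixel.

    Assumed layout; the last row is zero-padded if needed.
    """
    rows = math.ceil(len(data) / width)
    padded = data + bytes(rows * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(rows)]


def split_into_patches(image, patch):
    """Cut the image into non-overlapping patch x patch tiles (row-major).

    Each tile is flattened and scaled to [0, 1]. Both image dimensions
    must be divisible by the patch size.
    """
    height, width = len(image), len(image[0])
    assert height % patch == 0 and width % patch == 0
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tile = [image[top + i][left + j] / 255.0
                    for i in range(patch) for j in range(patch)]
            patches.append(tile)
    return patches


def build_token_sequence(patches, dim, seed=0):
    """Project patches to `dim`, prepend a [CLS] token, add position encodings.

    The projection, [CLS] token, and position encodings would be learned
    in a real model; here they are random placeholders (an assumption).
    """
    rng = random.Random(seed)
    patch_len = len(patches[0])
    proj = [[rng.gauss(0, 0.02) for _ in range(dim)] for _ in range(patch_len)]
    cls_token = [rng.gauss(0, 0.02) for _ in range(dim)]
    pos = [[rng.gauss(0, 0.02) for _ in range(dim)]
           for _ in range(len(patches) + 1)]
    embedded = [[sum(p[k] * proj[k][d] for k in range(patch_len))
                 for d in range(dim)] for p in patches]
    tokens = [cls_token] + embedded  # [CLS] sits at position 0
    return [[tok[d] + pos[i][d] for d in range(dim)]
            for i, tok in enumerate(tokens)]


# Toy input: 1024 synthetic bytes standing in for a malware binary.
data = bytes(range(256)) * 4
image = bytes_to_image(data, width=32)          # 32 x 32 pixels
patches = split_into_patches(image, patch=8)    # 16 patches of 64 values
tokens = build_token_sequence(patches, dim=16)  # 17 tokens of size 16
```

In a full model, the token sequence would pass through transformer encoder layers, and the output at the [CLS] position would feed a classifier over malware families.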

Acknowledgement

This work was supported by an IITP grant funded by the Korean government (MSIT) (No. 2020-0-01361, Artificial Intelligence Graduate School Program (Yonsei University)) and by the Air Force Defense Research Sciences Program funded by the Air Force Office of Scientific Research.

Author information

Authors and Affiliations

Department of Artificial Intelligence, Yonsei University, Seoul, 03722, Korea
Kyoung-Won Park & Sung-Bae Cho
Department of Computer Science, Yonsei University, Seoul, 03722, Korea
Sung-Bae Cho

Corresponding author

Correspondence to Sung-Bae Cho.

Editor information

Editors and Affiliations

University of Manchester, Manchester, UK
Hujun Yin
Technical University of Madrid, Madrid, Spain
David Camacho
University of Birmingham, Birmingham, UK
Peter Tino

About this paper

Cite this paper

Park, KW., Cho, SB. (2022). A Vision Transformer Enhanced with Patch Encoding for Malware Classification. In: Yin, H., Camacho, D., Tino, P. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2022. IDEAL 2022. Lecture Notes in Computer Science, vol 13756. Springer, Cham. https://doi.org/10.1007/978-3-031-21753-1_29

DOI: https://doi.org/10.1007/978-3-031-21753-1_29
Published: 21 November 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21752-4
Online ISBN: 978-3-031-21753-1
eBook Packages: Computer Science, Computer Science (R0)
