Abstract
Today, the extensive reliance on technology has exposed us to a constant threat of sophisticated malware attacks. Automated malware production techniques have evolved, some of which reuse specific segments of existing malware to produce new malware, making detection challenging. In this paper, we propose a Convolutional Recurrence based malware classification technique that exploits the visual recurrences in the grayscale images of malware samples belonging to the same malware family. First, we convert the malware samples into grayscale images so that a Convolutional Neural Network architecture can capture their structural similarities. We then perform data augmentation to counter the effects of high data imbalance and reduce class bias, so that training on the augmented dataset yields a more generalized framework. The augmented dataset is passed through a VGG16-based feature extractor to extract the visual outliers among the malware families. The extracted features are then processed by two stacked BiLSTM layers. The outputs of the BiLSTM layers and the VGG16 layer are merged to perform the final classification of each malware sample into its malware family. The model’s performance is further improved through hyperparameter tuning. We compare the performance of our algorithm against several baseline methods and some contemporary state-of-the-art methods for visual malware detection on two benchmark datasets. The experimental results demonstrate the utility and efficacy of the proposed malware family classification technique.
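As an illustration of the first step, the conversion of a malware sample into a grayscale image can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the image width or how a trailing partial row is handled, so the function name `bytes_to_grayscale`, the default width of 256, and the zero padding are all hypothetical choices. The only idea taken from the text is that each byte of the sample is read as one pixel intensity in the range 0 to 255.

```python
from typing import List


def bytes_to_grayscale(data: bytes, width: int = 256) -> List[List[int]]:
    """Reshape raw bytes into rows of `width` pixels (one byte = one 0-255 intensity).

    Assumption: the fixed width and the zero padding of the last row are
    illustrative choices, not values taken from the paper.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows: List[List[int]] = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row.extend([0] * (width - len(row)))  # zero-pad the final partial row
        rows.append(row)
    return rows


# Stand-in for the bytes of a binary; a real sample would be read from disk.
sample = bytes(range(10))
image = bytes_to_grayscale(sample, width=4)
```

Samples from the same family that share reused code segments would produce similar pixel rows under such a mapping, which is the kind of visual recurrence the proposed technique aims to exploit.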
Availability of data and materials
Not applicable.
Code availability (software application or custom code)
Not applicable.
Funding
(Information that explains whether and by whom the research was supported.)
Ethics declarations
Conflicts of interest/Competing interests
The authors declare that they have no conflicts of interest.
Ethics approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
The authors consent to publication in the journal.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Mallik, A., Khetarpal, A. & Kumar, S. ConRec: malware classification using convolutional recurrence. J Comput Virol Hack Tech 18, 297–313 (2022). https://doi.org/10.1007/s11416-022-00416-3
DOI: https://doi.org/10.1007/s11416-022-00416-3