S-DCNN: stacked deep convolutional neural networks for malware classification

Multimedia Tools and Applications

Abstract

Malware classification remains exceedingly difficult because of the exponential growth in the number and variants of malicious files. A robust malware protection and post-attack recovery system depends on classifying malicious files by their intent, activity, and threat. This paper proposes a novel deep learning-based model, S-DCNN, that efficiently classifies malware binary files into their respective malware families. S-DCNN uses an image-based representation of the malware binaries and leverages transfer learning and ensemble learning. The model incorporates three deep convolutional neural networks: ResNet50, Xception, and EfficientNet-B4. An ensemble technique combines the predictions of these component models, with a multilayer perceptron serving as the meta-classifier. The ensemble fuses the diverse knowledge of the component models, giving S-DCNN high generalizability and low variance. It also removes the need for feature engineering, reverse engineering, disassembly, and other domain-specific techniques previously used for malware classification. To establish S-DCNN's robustness and generalizability, the performance of the proposed model is evaluated on the Malimg dataset, on a dataset collected from VirusShare, and on packed-malware counterparts of both datasets. The proposed method achieves a state-of-the-art 10-fold accuracy of 99.43% on the Malimg dataset and an accuracy of 99.65% on the VirusShare dataset.
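The two ideas the abstract names, an image-based view of a binary and a stacked ensemble with a perceptron meta-classifier, can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the function names, the image width, the zero-padding of the last row, the single hidden layer, and the hand-picked weights. In S-DCNN the per-model probabilities would come from ResNet50, Xception, and EfficientNet-B4; here they are made-up numbers.

```python
from math import exp
from typing import List, Sequence


def bytes_to_image(data: bytes, width: int = 4) -> List[List[int]]:
    """Lay raw bytes out row by row as 8-bit grayscale pixels (0-255)."""
    if width <= 0:
        raise ValueError("width must be positive")
    # Assumption: the final, incomplete row is padded with zero bytes.
    padded = data + bytes(-len(data) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def stack_features(per_model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate each base model's class probabilities into one vector."""
    return [p for probs in per_model_probs for p in probs]


def meta_classify(
    features: Sequence[float],
    hidden_w: Sequence[Sequence[float]],
    hidden_b: Sequence[float],
    out_w: Sequence[Sequence[float]],
    out_b: Sequence[float],
) -> List[float]:
    """One-hidden-layer perceptron (ReLU, then softmax) over stacked features."""
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(hidden_w, hidden_b)
    ]
    logits = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(out_w, out_b)
    ]
    return softmax(logits)


# Five bytes become a 2 x 4 grayscale grid.
image = bytes_to_image(b"\x4d\x5a\x90\x00\x03", width=4)
# image == [[77, 90, 144, 0], [3, 0, 0, 0]]

# Hypothetical two-class probabilities from three base models.
base_probs = [[0.7, 0.3], [0.6, 0.4], [0.9, 0.1]]
features = stack_features(base_probs)

# Hand-picked illustrative weights; a real meta-classifier would learn these.
hidden_w = [[1, 0, 1, 0, 1, 0], [0, 1, 0, 1, 0, 1]]
hidden_b = [0.0, 0.0]
out_w = [[1, 0], [0, 1]]
out_b = [0.0, 0.0]
family_probs = meta_classify(features, hidden_w, hidden_b, out_w, out_b)
```

In this sketch the meta-classifier sees only the base models' output probabilities, never the image itself. That is the usual stacking arrangement; whether S-DCNN stacks probabilities or deeper features is not stated in the abstract.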



Author information

Corresponding author

Correspondence to Anil Singh Parihar.

Ethics declarations

Conflict of Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.


About this article

Cite this article

Parihar, A.S., Kumar, S. & Khosla, S. S-DCNN: stacked deep convolutional neural networks for malware classification. Multimed Tools Appl 81, 30997–31015 (2022). https://doi.org/10.1007/s11042-022-12615-7

  • DOI: https://doi.org/10.1007/s11042-022-12615-7
