A lightweight machine learning methods for malware classification

Cluster Computing

Abstract

Today’s information technology landscape is evolving rapidly, and security professionals are increasingly concerned with maintaining security and privacy. Research shows that new malware is emerging at a growing rate. Malware attack and defense form an endless cycle: antivirus firms continually create signatures for dangerous malware, while attackers continually seek to circumvent those signatures. Machine learning (ML) has proven highly effective at detecting malware. ML-based malware detection comprises two stages: feature extraction and malware classification. The proposed solutions are designed specifically for low-power embedded devices and edge computing systems, and they allow real-time malware detection without imposing a significant computational burden. This study provides an in-depth analysis of feature reduction and lightweight algorithms so that the proposed method works effectively and efficiently on devices ranging from PCs and IoT devices to servers. Extensive experiments were carried out on the BODMAS dataset to identify the best low-complexity method, which achieves an F1 score of more than 99%.
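As a rough illustration of the pipeline the abstract describes (feature reduction, then a lightweight classifier, scored by F1), the sketch below uses only the Python standard library. It is not the authors' method. Synthetic Gaussian data stands in for BODMAS feature vectors, a simple class-separation score stands in for the paper's feature reduction, and a nearest-centroid rule stands in for the lightweight classifier. All function names and parameters are hypothetical.

```python
import random
from statistics import mean, pvariance


def make_toy_data(n_samples=400, n_informative=5, n_noise=45, seed=0):
    """Synthetic stand-in for extracted feature vectors (assumption: NOT BODMAS)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # 1 = malware, 0 = benign (balanced toy classes)
        informative = [rng.gauss(3.0 * label, 1.0) for _ in range(n_informative)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(informative + noise)
        y.append(label)
    return X, y


def select_top_k(X, y, k):
    """Feature reduction: keep the k columns whose class means differ most
    relative to their spread (an illustrative score, not the paper's)."""
    scores = []
    for j in range(len(X[0])):
        col0 = [row[j] for row, lab in zip(X, y) if lab == 0]
        col1 = [row[j] for row, lab in zip(X, y) if lab == 1]
        spread = (pvariance(col0) + pvariance(col1)) ** 0.5 or 1e-12
        scores.append(abs(mean(col1) - mean(col0)) / spread)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def fit_centroids(X, y, cols):
    """Lightweight model: one mean vector per class over the selected columns."""
    centroids = {}
    for lab in (0, 1):
        rows = [[row[j] for j in cols] for row, l in zip(X, y) if l == lab]
        centroids[lab] = [mean(c) for c in zip(*rows)]
    return centroids


def predict(row, centroids, cols):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    point = [row[j] for j in cols]
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(point, centroids[lab])),
    )


def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


X, y = make_toy_data()
split = int(0.75 * len(X))  # simple hold-out split
X_train, y_train, X_test, y_test = X[:split], y[:split], X[split:], y[split:]

cols = select_top_k(X_train, y_train, k=5)  # 50 features reduced to 5
centroids = fit_centroids(X_train, y_train, cols)
y_pred = [predict(row, centroids, cols) for row in X_test]
score = f1_score(y_test, y_pred)
print(f"kept columns: {sorted(cols)}, F1 on held-out toy data: {score:.3f}")
```

At prediction time the model stores two short vectors and computes two distances per sample, which is the kind of cost profile that suits constrained devices. Any score obtained on this toy data says nothing about performance on real malware.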

Data availability

No datasets were generated or analysed during the current study.

Author information

Contributions

Mahmoud and Ibrahim wrote the main manuscript text, and Ahmad prepared the figures and tables. Radwan developed the machine learning model code according to the model design agreed on by all authors. Finally, all authors reviewed the manuscript.

Corresponding author

Correspondence to Mahmoud E. Farfoura.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Farfoura, M.E., Mashal, I., Alkhatib, A. et al. A lightweight machine learning methods for malware classification. Cluster Comput 28, 1 (2025). https://doi.org/10.1007/s10586-024-04755-2

  • DOI: https://doi.org/10.1007/s10586-024-04755-2
