Abstract
Today’s information technology landscape is evolving rapidly, and security professionals are increasingly concerned with maintaining security and privacy. Research shows that new malware is emerging at a growing rate. Malware attack and defense form an endless cycle: antivirus firms continually create signatures for dangerous malware, while attackers continually seek to evade those signatures. Machine learning (ML) has proven highly effective at detecting malware. ML-based malware detection involves two main tasks: feature extraction and malware classification. The proposed solutions are designed specifically for low-power embedded devices and edge computing systems, and they allow real-time malware detection without imposing a significant computational burden. This study provides an in-depth analysis of feature reduction and lightweight algorithms so that the proposed method works effectively and efficiently across devices ranging from PCs and IoT devices to servers. Extensive experiments were carried out on the BODMAS dataset to identify the best low-complexity method, which achieves an F1 score of more than 99%.
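The two ideas the abstract relies on, reducing the feature set and scoring a classifier by F1, can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name the reduction technique, so variance ranking is used here only as a stand-in, and the function names (`select_by_variance`, `reduce_rows`, `f1_score`) and the toy feature rows are hypothetical.

```python
from statistics import pvariance

# Toy feature matrix (assumption: 4 samples x 4 features, not BODMAS data).
# Columns 0 and 1 are constant, so they carry no information.
rows = [
    [0.0, 5.0, 1.0, 0.1],
    [0.0, 5.0, 9.0, 0.9],
    [0.0, 5.0, 2.0, 0.2],
    [0.0, 5.0, 8.0, 0.8],
]


def select_by_variance(samples, keep):
    """Return the indices of the `keep` highest-variance feature columns."""
    n_features = len(samples[0])
    variances = [pvariance([s[j] for s in samples]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:keep])


def reduce_rows(samples, columns):
    """Project every sample onto the selected feature columns."""
    return [[s[j] for j in columns] for s in samples]


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision/recall undefined or zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    kept = select_by_variance(rows, keep=2)
    print("kept columns:", kept)
    print("reduced rows:", reduce_rows(rows, kept))
    print("F1:", f1_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

Dropping uninformative columns before training shrinks both the model input and the per-sample inference cost, which is the property that matters on low-power devices. F1 is reported rather than plain accuracy because it balances missed malware against false alarms.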
Data availability
No datasets were generated or analysed during the current study.
Author information
Contributions
Mahmoud and Ibrahim wrote the main manuscript text, and Ahmad prepared the figures and tables. Radwan developed the machine learning model code according to the model design agreed on by all authors. Finally, all authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Farfoura, M.E., Mashal, I., Alkhatib, A. et al. A lightweight machine learning methods for malware classification. Cluster Comput 28, 1 (2025). https://doi.org/10.1007/s10586-024-04755-2
DOI: https://doi.org/10.1007/s10586-024-04755-2