Abstract
Signature-based detection is the technique most commercial antivirus engines rely on for malware detection. Over the past few years, however, researchers have shifted their focus to machine learning (ML)-based methods, because signature-based methods have the limitation of not recognizing new variants of malware, whereas ML-based methods have proven their generalization capabilities across domains such as images, text, and audio. Yet even though ML-based methods are more robust than signature-based ones, they can be easily fooled by an adversarial attack: an attacker simply crafts an adversarial sample that looks similar to the original sample but carries some added perturbation. Such attacks must be carried out to understand the potential threat under different scenarios and to leverage that knowledge to harden the defenses of ML models. We propose a reinforcement learning (RL)-based approach that integrates action masking into the agent's learning process to generate adversarial examples. We attack two pre-trained static models: EMBER, trained on 1M+ samples, and GBDT, trained on 100K+ samples. We also attack a surrogate model trained on 25K+ samples as a substitute for 11 self-trained hybrid models, and we demonstrate a transferability attack on all the hybrid models using the adversarial examples generated against the surrogate and the two pre-trained static models. We further perform adversarial defense training on these 11 hybrid models and evaluate their robustness against such attacks. Using action masks, we achieve maximum evasion rates of 47.64%, 33.2%, and 94.67% for the GBDT, EMBER, and surrogate models, respectively. We also present binary modification evidence and a few corner cases. Since modifying binary files is tricky, even a 1-byte change at the wrong place can corrupt the binary; our modification routine therefore covers such corner cases and restricts the agent from taking actions that could potentially corrupt the binary.
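The core idea of action masking can be illustrated with a small sketch. This is not the paper's implementation; the action names and the mask below are hypothetical. The point is that before the agent samples a modification action, actions flagged as unsafe for the current binary (e.g. patching a header field that would corrupt the PE file) are masked out, so the policy distribution is renormalized over safe actions only:

```python
import random

# Hypothetical PE-modification actions; which ones are "safe" for a given
# binary would come from a validity check on that binary.
ACTIONS = ["append_overlay", "add_section", "patch_header", "rename_section"]

def masked_sample(action_probs, mask, rng):
    """Sample an action index from the policy's probabilities,
    zeroing out actions whose mask entry is 0 and renormalizing."""
    weights = [p if m else 0.0 for p, m in zip(action_probs, mask)]
    if sum(weights) == 0.0:
        raise ValueError("mask disables every action")
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]

rng = random.Random(0)
probs = [0.25, 0.25, 0.25, 0.25]   # uniform policy output, for illustration
mask = [1, 1, 0, 1]                # "patch_header" could corrupt this binary
action = masked_sample(probs, mask, rng)
assert ACTIONS[action] != "patch_header"  # masked action is never chosen
```

In practice the mask would be computed per step from the current state of the binary, which is what lets the modification routine rule out the corner cases described above before the agent can act on them.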
Availability of data and material
The data that support the findings of this study are available on request from the corresponding author.
Funding
This work is partially funded by SERB, Government of India.
Contributions
All authors contributed equally.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed consent
This article does not contain any studies with human participants.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Pandey, S., Kumar, N., Handa, A. et al. Evading malware classifiers using RL agent with action-mask. Int. J. Inf. Secur. 22, 1743–1763 (2023). https://doi.org/10.1007/s10207-023-00715-w