Adversaries Strike Hard: Adversarial Attacks Against Malware Classifiers Using Dynamic API Calls as Features

  • Conference paper
  • First Online:
Cyber Security Cryptography and Machine Learning (CSCML 2021)

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 12716)

Abstract

Malware designers have become increasingly sophisticated over time, crafting polymorphic and metamorphic malware that employs obfuscation tricks such as packing and encryption to evade signature-based malware detection systems. Security professionals therefore toughen their defenses with machine learning-based systems built on malware’s dynamic behavioral features. However, these systems are susceptible to adversarial inputs, and some malware designers exploit this vulnerability to bypass detection. In this work, we develop two approaches to evade machine learning-based classifiers. First, we create a Generative Adversarial Network (GAN)-based method, which we call ‘Malware Evasion using GAN’ (MEGAN), and an extended version, ‘Malware Evasion using GAN with Reduced Perturbation’ (MEGAN-RP). Second, we develop a novel reinforcement learning-based approach called ‘Malware Evasion using Reinforcement Agent’ (MERA). We generate adversarial malware that simultaneously minimizes the recall of a target classifier and the amount of perturbation needed in the actual malware to evade detection. We evaluate our work against 13 different black-box detection models, all of which use the dynamic presence or absence of API calls as features. Our approaches reduce the recall of almost all black-box models to zero. Further, MERA outperforms the other approaches, reducing the True Positive Rate (TPR) to zero against all target models except the Decision Tree (DT), with minimum perturbation in 6 of the 13 target models. We also present experimental results on an adversarial retraining defense and its evasion by the GAN-based strategies.
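As a concrete illustration of the GAN-based strategy: the attack operates on binary presence/absence vectors of API calls, and the generator may only add calls, never remove them, so the malware's functionality is preserved. The following is a minimal sketch of this pattern in Keras, assuming a MalGAN-style setup with a substitute detector standing in for the black box; the dimensions, layer sizes, and helper names are hypothetical assumptions, not the actual MEGAN architecture.

```python
# Minimal MalGAN-style sketch; all sizes and names are hypothetical.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

API_DIM = 128    # hypothetical number of monitored API calls
NOISE_DIM = 16   # hypothetical noise dimension

# Generator: (malware features, noise) -> per-API insertion probabilities.
m_in = layers.Input(shape=(API_DIM,), name="malware_features")
z_in = layers.Input(shape=(NOISE_DIM,), name="noise")
h = layers.Dense(256, activation="relu")(layers.Concatenate()([m_in, z_in]))
g_out = layers.Dense(API_DIM, activation="sigmoid")(h)
# Element-wise maximum: API calls can only be added, never removed.
adv_out = layers.Maximum()([m_in, g_out])
generator = Model([m_in, z_in], adv_out)

# Substitute detector that locally mimics the black-box classifier.
substitute = tf.keras.Sequential([
    layers.Input(shape=(API_DIM,)),
    layers.Dense(256, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # predicted P(malware)
])
substitute.compile(optimizer="adam", loss="binary_crossentropy")

# Combined model: substitute frozen, generator trained toward "benign" (0).
substitute.trainable = False
combined = Model([m_in, z_in], substitute(adv_out))
combined.compile(optimizer="adam", loss="binary_crossentropy")

def train_step(m_batch, benign_batch, query_black_box):
    """One alternating update; query_black_box is a hypothetical callable
    returning the target model's 0/1 labels for a batch of feature vectors."""
    z = np.random.normal(size=(len(m_batch), NOISE_DIM)).astype("float32")
    adv = (generator.predict([m_batch, z], verbose=0) >= 0.5).astype("float32")
    # 1) Fit the substitute on the black box's labels for adversarial + benign data.
    x = np.concatenate([adv, benign_batch])
    y = np.concatenate([query_black_box(adv), query_black_box(benign_batch)])
    substitute.fit(x, y, verbose=0)
    # 2) Update the generator through the frozen substitute toward label 0.
    combined.fit([m_batch, z], np.zeros(len(m_batch), dtype="float32"), verbose=0)
```

At attack time the generator's continuous output is binarized (e.g., thresholded at 0.5); because of the element-wise maximum, every API call present in the original sample survives, so only additions count toward the perturbation that MEGAN-RP tries to keep small.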
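The reinforcement-learning strategy can likewise be pictured as an agent that inserts one API call per step and is rewarded for flipping the black box's verdict with as few insertions as possible. Below is a simplified, hypothetical environment for such an agent; the reward values, step limit, and stub detector are placeholders rather than MERA's actual design.

```python
# Simplified evasion-environment sketch; rewards and the stub detector are
# placeholders, not MERA's actual design.
import numpy as np

API_DIM = 128  # hypothetical number of monitored API calls

class EvasionEnv:
    """Each action inserts one API call; the episode ends on evasion or timeout."""
    def __init__(self, malware_vec, black_box, max_steps=20):
        self.orig = malware_vec.astype(np.float32)
        self.black_box = black_box      # callable: feature vector -> P(malware)
        self.max_steps = max_steps

    def reset(self):
        self.state = self.orig.copy()
        self.steps = 0
        return self.state.copy()

    def step(self, action):
        self.state[action] = 1.0        # add (never remove) an API call
        self.steps += 1
        evaded = self.black_box(self.state) < 0.5
        # Large terminal reward for evasion, small per-step penalty so the
        # learned policy prefers minimal perturbation.
        reward = 10.0 if evaded else -0.1
        done = evaded or self.steps >= self.max_steps
        return self.state.copy(), reward, done

# Toy rollout with a stub detector (a real attack queries the target model);
# the stub reports "benign" once three of the first ten calls are present.
rng = np.random.default_rng(0)
stub = lambda v: 0.9 if v[:10].sum() < 3 else 0.1
env = EvasionEnv((rng.random(API_DIM) < 0.2).astype(np.float32), stub)
state, done = env.reset(), False
while not done:
    state, reward, done = env.step(int(rng.integers(API_DIM)))  # random policy
```

A DQN-style agent trained over many such episodes would replace the random policy here, learning which insertions flip the classifier's decision with the fewest additions, matching the abstract's twin objectives of driving recall to zero while keeping perturbation minimal.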

Partially supported by SERB, Government of India.



Author information

Correspondence to Anand Handa.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Hariom, Handa, A., Kumar, N., Kumar Shukla, S. (2021). Adversaries Strike Hard: Adversarial Attacks Against Malware Classifiers Using Dynamic API Calls as Features. In: Dolev, S., Margalit, O., Pinkas, B., Schwarzmann, A. (eds.) Cyber Security Cryptography and Machine Learning. CSCML 2021. Lecture Notes in Computer Science, vol. 12716. Springer, Cham. https://doi.org/10.1007/978-3-030-78086-9_2


  • DOI: https://doi.org/10.1007/978-3-030-78086-9_2

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-78085-2

  • Online ISBN: 978-3-030-78086-9

  • eBook Packages: Computer Science, Computer Science (R0)
