Assessing Adaptive Attacks Against Trained JavaScript Classifiers

Conference paper in Security and Privacy in Communication Networks (SecureComm 2020)

Abstract

In this work, we evaluate the security of heuristic- and machine learning-based classifiers for the detection of malicious JavaScript code. Due to the prevalence of web attacks directed through JavaScript injected into webpages, such defense mechanisms serve as a last line of defense by classifying individual scripts as either benign or malicious. State-of-the-art classifiers work well at distinguishing currently-known malicious scripts from existing legitimate functionality, often by employing training sets of known benign or malicious samples. However, we observe that real-world attackers can be adaptive, and tailor their attacks to the benign content of the page and the defense mechanisms being used to defend the page.

In this work, we consider a variety of techniques that an adaptive adversary may use to overcome JavaScript classifiers. We introduce new threat models that consider various types of adaptive adversaries, with varying knowledge of the classifier and of the dataset being used to detect malicious scripts. We show that while no heuristic defense mechanism is a silver bullet against an adaptive adversary, some techniques are far more effective than others. Thus, our work identifies which techniques should be considered best practices for classifying malicious content, and serves as a call to arms for more advanced classification.


Notes

  1. Experimentally, we determined that increasing the maximum tree-edit distance above 20 results in a sharp increase in the detection rate for the generated samples. This applies to both classifiers in our examination.

  2. It is in principle possible to extend the algorithm with backtracking during the gadget search process, enabling it to generate an arbitrary number of variants.
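To make the tree-edit distance budget in Note 1 concrete, the following is a minimal sketch, not the paper's implementation, of an ordered tree edit distance over simplified AST nodes, used to reject generated variants that drift more than 20 edits from the seed script. The tuple-based node encoding, the `within_budget` helper, and the toy trees are all hypothetical; a real pipeline would operate on ASTs from an actual JavaScript parser.

```python
from functools import lru_cache

# Hypothetical AST node encoding: (label, (child, child, ...)).

def size(tree):
    """Number of nodes in a tree."""
    return 1 + sum(size(c) for c in tree[1])

@lru_cache(maxsize=None)
def forest_dist(f1, f2):
    """Edit distance between two ordered forests (tuples of trees),
    with unit cost for node insertion, deletion, and relabeling."""
    if not f1 and not f2:
        return 0
    if not f1:
        return sum(size(t) for t in f2)  # insert everything in f2
    if not f2:
        return sum(size(t) for t in f1)  # delete everything in f1
    t1, t2 = f1[-1], f2[-1]
    return min(
        forest_dist(f1[:-1] + t1[1], f2) + 1,  # delete root of t1
        forest_dist(f1, f2[:-1] + t2[1]) + 1,  # insert root of t2
        forest_dist(f1[:-1], f2[:-1])          # match the two roots,
        + forest_dist(t1[1], t2[1])            # recurse on their children,
        + (t1[0] != t2[0]),                    # relabel if labels differ
    )

def tree_edit_distance(a, b):
    return forest_dist((a,), (b,))

def within_budget(seed_ast, variant_ast, max_dist=20):
    """Reject variants that drift too far from the seed (Note 1's threshold)."""
    return tree_edit_distance(seed_ast, variant_ast) <= max_dist

# Toy example: relabeling a single identifier costs one edit.
seed    = ("Program", (("Call", (("Ident", ()),)),))
variant = ("Program", (("Call", (("Eval", ()),)),))
```

Here `tree_edit_distance(seed, variant)` is 1, so the variant stays well within a budget of 20. The memoized recursion above is exponential in the worst case; production differencing tools use polynomial-time algorithms instead.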


Acknowledgments

We thank the anonymous reviewers and our shepherd, Yuan Zhang, for their insightful comments. We further thank Louis Narmour and Devin Dennis for their help in building our infrastructure and dataset, and Bruce Kapron and Somesh Jha for informative early discussions on the problems tackled in this paper.

Author information

Corresponding author: Lorenzo De Carli.

Copyright information

© 2020 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Hansen, N., De Carli, L., Davidson, D. (2020). Assessing Adaptive Attacks Against Trained JavaScript Classifiers. In: Park, N., Sun, K., Foresti, S., Butler, K., Saxena, N. (eds) Security and Privacy in Communication Networks. SecureComm 2020. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 335. Springer, Cham. https://doi.org/10.1007/978-3-030-63086-7_12

  • DOI: https://doi.org/10.1007/978-3-030-63086-7_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-63085-0

  • Online ISBN: 978-3-030-63086-7

  • eBook Packages: Computer Science (R0)
