Abstract
In this work, we evaluate the security of heuristic- and machine learning-based classifiers for the detection of malicious JavaScript code. Due to the prevalence of web attacks delivered through JavaScript injected into webpages, such classifiers serve as a last line of defense by labeling individual scripts as either benign or malicious. State-of-the-art classifiers work well at distinguishing currently known malicious scripts from existing legitimate functionality, often by employing training sets of known benign and malicious samples. However, we observe that real-world attackers can be adaptive, tailoring their attacks both to the benign content of the page and to the defense mechanisms protecting it.
We consider a variety of techniques that an adaptive adversary may use to overcome JavaScript classifiers, and introduce new threat models covering adaptive adversaries with varying degrees of knowledge of the classifier and of the dataset used to detect malicious scripts. We show that while no heuristic defense mechanism is a silver bullet against an adaptive adversary, some techniques are far more effective than others. Our work thus points to which techniques should be considered best practices in classifying malicious content, and serves as a call to arms for more advanced classification.
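To make the classification setting concrete, the sketch below mimics the general shape of syntactic detectors such as JaSt: extract token n-grams from a script and score them against profiles built from labeled samples. This is an illustrative toy, not the classifiers evaluated in the paper; all function names and the regex tokenizer are our own simplifications (a real system walks the parsed AST rather than regex-tokenizing source text).

```python
# Illustrative sketch only -- NOT the classifiers studied in the paper.
# Syntactic detectors score scripts by the n-grams of syntactic units
# they contain; here we approximate that with crude token classes.
import re
from collections import Counter

def token_ngrams(source: str, n: int = 2) -> Counter:
    """Tokenize JavaScript-like source and count token-class n-grams."""
    tokens = re.findall(r"[A-Za-z_$][\w$]*|\d+|\S", source)
    kinds = ["ID" if t[0].isalpha() or t[0] in "_$" else
             "NUM" if t[0].isdigit() else t
             for t in tokens]
    return Counter(tuple(kinds[i:i + n]) for i in range(len(kinds) - n + 1))

def score(sample: Counter, malicious: Counter, benign: Counter) -> float:
    """Naive malice score: fraction of n-grams seen in the malicious
    profile minus the fraction seen in the benign profile."""
    mal = sum(c for g, c in sample.items() if g in malicious)
    ben = sum(c for g, c in sample.items() if g in benign)
    total = sum(sample.values()) or 1
    return (mal - ben) / total
```

A positive score marks a script as suspicious; an adaptive adversary attacks exactly this kind of model by rewriting malicious code so its n-gram profile resembles the benign training data.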
Notes
- 1.
Experimentally, we determined that increasing the maximum tree-edit distance above 20 results in a sharp increase in the detection rate for the generated samples. This applies to both classifiers in our examination.
- 2.
It is in principle possible to extend the algorithm with backtracking during the gadget search process, enabling it to generate an arbitrary number of variants.
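The tree-edit-distance budget mentioned in note 1 can be sketched as follows. True tree-edit distance (e.g. Zhang–Shasha) operates on the AST directly; as a cheap stand-in, this toy computes Levenshtein distance over a preorder serialization of node types. The `within_budget` helper and the nested `(type, children)` AST encoding are our own illustrative assumptions, not the paper's implementation.

```python
# Illustrative proxy only: real systems compute tree-edit distance on
# the AST itself; string edit distance over a preorder traversal is a
# cheap approximation used here to show how a distance budget works.
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,        # deletion
                           cur[j - 1] + 1,     # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]

def preorder(node):
    """Serialize a nested (type, children) AST tuple in preorder."""
    kind, children = node
    out = [kind]
    for child in children:
        out.extend(preorder(child))
    return out

def within_budget(orig_ast, variant_ast, max_dist=20):
    """Accept a variant only if its distance stays under the budget."""
    return levenshtein(preorder(orig_ast), preorder(variant_ast)) <= max_dist
```

Under this scheme, raising `max_dist` past the threshold reported in note 1 admits variants that diverge enough from benign structure to trip the detectors.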
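The backtracking extension described in note 2 can be sketched as a recursive search: each rewrite site ("slot") offers several candidate gadgets, and the search backtracks over the choices, yielding every combination that passes a caller-supplied filter. The function name and interface are hypothetical, chosen only to illustrate how backtracking yields an arbitrary number of variants.

```python
# Hypothetical sketch of the backtracking gadget search from note 2:
# enumerate all assignments of one gadget per slot, pruning with a
# caller-supplied acceptance predicate.
def variants(slots, accept, chosen=()):
    """Yield every gadget assignment over `slots` that `accept` approves."""
    if len(chosen) == len(slots):
        if accept(chosen):
            yield chosen
        return
    for gadget in slots[len(chosen)]:          # try each candidate,
        yield from variants(slots, accept, chosen + (gadget,))  # backtrack on return
```

Because the generator lazily explores the product of all slot choices, the number of producible variants grows multiplicatively with the gadget pool, which is why backtracking lifts the variant-count limit of the non-backtracking search.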
Acknowledgments
We thank the anonymous reviewers and our shepherd, Yuan Zhang, for their insightful comments. We further thank: Louis Narmour and Devin Dennis for their help in building infrastructure and dataset; Bruce Kapron and Somesh Jha for informative early discussions on the problems tackled in this paper.
Copyright information
© 2020 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
Cite this paper
Hansen, N., De Carli, L., Davidson, D. (2020). Assessing Adaptive Attacks Against Trained JavaScript Classifiers. In: Park, N., Sun, K., Foresti, S., Butler, K., Saxena, N. (eds) Security and Privacy in Communication Networks. SecureComm 2020. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 335. Springer, Cham. https://doi.org/10.1007/978-3-030-63086-7_12
Print ISBN: 978-3-030-63085-0
Online ISBN: 978-3-030-63086-7