JaSt: Fully Syntactic Detection of Malicious (Obfuscated) JavaScript

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 10885)

Abstract

JavaScript is a browser scripting language initially created to enhance the interactivity of web sites and to improve their user-friendliness. However, as it offloads the work to the user’s browser, it can be used to engage in malicious activities such as crypto mining, drive-by download attacks, or redirections to web sites hosting malicious software. Given the prevalence of such nefarious scripts, the anti-virus industry has increased its focus on their detection. The attackers, in turn, make increasing use of obfuscation techniques to hinder analysis and the creation of corresponding signatures. Yet these malicious samples share syntactic similarities at an abstract level, which makes it possible to bypass obfuscation and detect even unknown malware variants.

In this paper, we present JaSt, a low-overhead solution that combines the extraction of features from the abstract syntax tree with a random forest classifier to detect malicious JavaScript instances. It is based on a frequency analysis of specific patterns, which are predictive of either benign or malicious samples. Even though the analysis is entirely static, it yields a high detection accuracy of almost 99.5% and has a low false-negative rate of 0.54%.
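As a rough illustration of what a frequency analysis of syntactic patterns could look like, the sketch below turns a sequence of AST node-type names into relative n-gram frequencies, which could then serve as a feature vector for a random forest classifier. The abstract does not spell out the pattern definition, so treating patterns as n-grams, the value of `n`, the function names, and the sample node sequence are all assumptions made for this example; parsing JavaScript into an AST and training the classifier are not shown. The `false_negative_rate` helper uses the standard definition FN / (FN + TP) with made-up counts that are not taken from the paper.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

NGram = Tuple[str, ...]


def ngram_frequencies(units: Sequence[str], n: int) -> Dict[NGram, float]:
    """Return the relative frequency of every n-gram of syntactic units."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Slide a window of length n over the unit sequence.
    grams = [tuple(units[i:i + n]) for i in range(len(units) - n + 1)]
    if not grams:
        return {}
    counts = Counter(grams)
    return {gram: count / len(grams) for gram, count in counts.items()}


def to_vector(freqs: Dict[NGram, float], vocabulary: Sequence[NGram]) -> List[float]:
    """Project frequencies onto a fixed n-gram vocabulary (missing n-grams become 0.0)."""
    return [freqs.get(gram, 0.0) for gram in vocabulary]


def false_negative_rate(false_negatives: int, true_positives: int) -> float:
    """Fraction of malicious samples that were classified as benign."""
    return false_negatives / (false_negatives + true_positives)


# Hypothetical node-type sequence, as a parser-based traversal might produce it.
units = [
    "VariableDeclaration", "VariableDeclarator", "Identifier",
    "CallExpression", "Identifier", "CallExpression",
]
freqs = ngram_frequencies(units, n=2)

# Hypothetical vocabulary; in practice it would be fixed from the training set.
vocabulary = [("Identifier", "CallExpression"), ("CallExpression", "Literal")]
vector = to_vector(freqs, vocabulary)

# Toy counts for illustration only.
fnr = false_negative_rate(false_negatives=1, true_positives=199)
```

Because the features depend only on node types and not on identifier names or string contents, renaming variables or re-encoding strings leaves the vector unchanged in this sketch, which is one way to read the abstract's claim about syntactic similarity surviving obfuscation.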

Notes

  1. Malware don’t need Coffee, https://malware.dontneedcoffee.com.

  2. Alexa top sites, http://www.alexa.com/topsites.

Acknowledgments

This work would not have been possible without the help of the German Federal Office for Information Security and Kafeine DNC, which provided us with materials for our experiments. We would also like to thank the anonymous reviewers of this paper for their much-appreciated feedback. This work was partially supported by the German Federal Ministry of Education and Research (BMBF) through funding for the Center for IT-Security, Privacy and Accountability (CISPA) (FKZ: 16KIS0345).

Author information

Corresponding author

Correspondence to Aurore Fass.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Fass, A., Krawczyk, R.P., Backes, M., Stock, B. (2018). JaSt: Fully Syntactic Detection of Malicious (Obfuscated) JavaScript. In: Giuffrida, C., Bardin, S., Blanc, G. (eds) Detection of Intrusions and Malware, and Vulnerability Assessment. DIMVA 2018. Lecture Notes in Computer Science, vol 10885. Springer, Cham. https://doi.org/10.1007/978-3-319-93411-2_14

  • DOI: https://doi.org/10.1007/978-3-319-93411-2_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-93410-5

  • Online ISBN: 978-3-319-93411-2

  • eBook Packages: Computer Science, Computer Science (R0)
