Generic, efficient, and effective deobfuscation and semantic-aware attack detection for PowerShell scripts

Xiong, Chunlin; Li, Zhenyuan; Chen, Yan; Zhu, Tiantian; Wang, Jian; Yang, Hai; Ruan, Wei

doi:10.1631/FITEE.2000436

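Chinese title (translated): A generic, effective, and lightweight method for PowerShell deobfuscation and semantic-aware attack detection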

Research Article
Published: 26 March 2022

Volume 23, pages 361–381, (2022)

Frontiers of Information Technology & Electronic Engineering

Chunlin Xiong (熊春霖) ORCID: orcid.org/0000-0003-4426-3585¹,
Zhenyuan Li (李振源)¹,
Yan Chen (陈焰)²,
Tiantian Zhu (朱添田)³,
Jian Wang (王箭)¹,
Hai Yang (杨海)⁴ &
Wei Ruan (阮伟) ORCID: orcid.org/0000-0001-8721-4391⁵

Abstract

In recent years, PowerShell has increasingly been reported as appearing in a variety of cyber attacks. However, because the PowerShell language is dynamic by design and can construct script fragments at different levels, state-of-the-art PowerShell attack detection approaches based on static analysis are inherently vulnerable to obfuscation. In this paper, we design the first generic, effective, and lightweight deobfuscation approach for PowerShell scripts. To precisely identify the obfuscated script fragments, we define obfuscation based on the differences in the impacts on the abstract syntax trees of PowerShell scripts and propose a novel emulation-based recovery technique. Furthermore, we design the first semantic-aware PowerShell attack detection system, which leverages the classic objective-oriented association mining algorithm and newly identifies 31 semantic signatures. The experimental results on 2342 benign samples and 4141 malicious samples show that our deobfuscation method takes less than 0.5 s on average and increases the similarity between the obfuscated and original scripts from 0.5% to 93.2%. By deploying our deobfuscation method, the attack detection rates for Windows Defender and VirusTotal increase substantially from 0.33% and 2.65% to 78.9% and 94.0%, respectively. Moreover, our detection system outperforms both existing tools with a 96.7% true positive rate and a 0% false positive rate on average.
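
As a rough illustration of what recovering an obfuscated fragment means, the Python sketch below folds concatenated string literals back into a single literal. It is a toy stand-in and not the authors' method: the abstract describes locating obfuscated fragments through the abstract syntax tree and recovering them by emulation, whereas this sketch uses a regular expression and handles only one obfuscation pattern. The function name `fold_string_concat` and the sample script are hypothetical.

```python
import re

# Matches two single-quoted string literals joined by '+'.
# Toy assumption: literals contain no embedded quotes or escape sequences.
_CONCAT = re.compile(r"'([^']*)'\s*\+\s*'([^']*)'")


def fold_string_concat(script: str) -> str:
    """Repeatedly merge 'a'+'b' into 'ab' until the script stops changing."""
    previous = None
    while previous != script:
        previous = script
        script = _CONCAT.sub(
            lambda m: "'" + m.group(1) + m.group(2) + "'", script
        )
    return script


# Hypothetical obfuscated one-liner: the command name is split into pieces.
obfuscated = "Invoke-Expression ('Wri'+'te-H'+'ost' + ' hello')"
print(fold_string_concat(obfuscated))
# prints: Invoke-Expression ('Write-Host hello')
```

A pattern-matching approach like this covers a single transformation only, which is one reason a generic approach would work on the syntax tree and evaluate fragments instead of enumerating textual patterns.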

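Abstract (translated from the Chinese)

In recent years, PowerShell attacks have been reported with increasing frequency. However, because the PowerShell language is dynamic and can construct script fragments at different levels, even PowerShell attack detection methods based on state-of-the-art static script analysis are inherently vulnerable to obfuscation. This paper designs a generic, effective, and lightweight deobfuscation method for PowerShell scripts. First, to precisely identify obfuscated script fragments, we propose a novel obfuscated-fragment detection method based on the impact of obfuscation methods on the PowerShell abstract syntax tree, and on this basis we propose an emulation-based recovery technique. In addition, we design a semantic-aware PowerShell attack detection system that uses the classic objective-oriented association mining algorithm and newly identifies 31 semantic signatures for malicious script detection. Experimental results on 2342 benign samples and 4141 malicious samples show that the proposed deobfuscation method takes less than 0.5 s on average and raises the similarity between obfuscated and original scripts from 0.5% to 93.2%. With this deobfuscation method, the attack detection rates of Windows Defender and VirusTotal rise from 0.33% and 2.65% to 78.9% and 94.0%, respectively. The experiments also show that our detection system outperforms both existing tools (average true positive rate of 96.7% and false positive rate of 0%).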

Author information

Authors and Affiliations

¹ College of Computer Science and Technology, Zhejiang University, Hangzhou, 310027, China: Chunlin Xiong (熊春霖), Zhenyuan Li (李振源) & Jian Wang (王箭)
² Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL, 60208, USA: Yan Chen (陈焰)
³ College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, China: Tiantian Zhu (朱添田)
⁴ Magic Shield Co., Ltd., Hangzhou, 310027, China: Hai Yang (杨海)
⁵ College of Control Science and Engineering, Zhejiang University, Hangzhou, 310027, China: Wei Ruan (阮伟)

Contributions

Chunlin XIONG and Zhenyuan LI designed the research. Tiantian ZHU and Hai YANG investigated the background. Jian WANG and Hai YANG processed the data. Zhenyuan LI and Chunlin XIONG drafted the paper. Wei RUAN and Tiantian ZHU helped organize the paper. Yan CHEN, Tiantian ZHU, and Wei RUAN revised and finalized the paper.

Corresponding author

Correspondence to Wei Ruan (阮伟).

Ethics declarations

Chunlin XIONG, Zhenyuan LI, Yan CHEN, Tiantian ZHU, Jian WANG, Hai YANG, and Wei RUAN declare that they have no conflict of interest.

Additional information

Project supported by the National Natural Science Foundation of China (No. U1936215)

About this article

Cite this article

Xiong, C., Li, Z., Chen, Y. et al. Generic, efficient, and effective deobfuscation and semantic-aware attack detection for PowerShell scripts. Front Inform Technol Electron Eng 23, 361–381 (2022). https://doi.org/10.1631/FITEE.2000436

Received: 28 August 2020
Accepted: 29 December 2020
Published: 26 March 2022
Issue Date: March 2022
DOI: https://doi.org/10.1631/FITEE.2000436

CLC number

TP309
