Generic, efficient, and effective deobfuscation and semantic-aware attack detection for PowerShell scripts

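Chinese title (translated): A generic, effective, and lightweight method for PowerShell deobfuscation and semantic-aware attack detection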

  • Research Article
Frontiers of Information Technology & Electronic Engineering

Abstract

In recent years, PowerShell has increasingly been reported in a variety of cyber attacks. However, because the PowerShell language is dynamic by design and can construct script fragments at different levels, state-of-the-art PowerShell attack detection approaches based on static analysis are inherently vulnerable to obfuscation. In this paper, we design the first generic, effective, and lightweight deobfuscation approach for PowerShell scripts. To precisely identify the obfuscated script fragments, we define obfuscation based on the differences in its impact on the abstract syntax trees of PowerShell scripts and propose a novel emulation-based recovery technique. Furthermore, we design the first semantic-aware PowerShell attack detection system, which leverages the classic objective-oriented association mining algorithm and newly identifies 31 semantic signatures. The experimental results on 2342 benign samples and 4141 malicious samples show that our deobfuscation method takes less than 0.5 s on average and increases the similarity between the obfuscated and original scripts from 0.5% to 93.2%. With our deobfuscation method deployed, the attack detection rates of Windows Defender and VirusTotal increase substantially, from 0.33% and 2.65% to 78.9% and 94.0%, respectively. Moreover, our detection system outperforms both existing tools, with a 96.7% true positive rate and a 0% false positive rate on average.
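The idea of locating recoverable fragments in a syntax tree and replacing them with their evaluated result can be illustrated with a toy sketch. It is not the authors' implementation: it uses Python's own `ast` module as a stand-in for the PowerShell abstract syntax tree, it handles only string concatenation, and the names `FoldStrings`, `recover`, and `run` are hypothetical. It assumes Python 3.9 or later for `ast.unparse`.

```python
import ast


class FoldStrings(ast.NodeTransformer):
    """Bottom-up pass that replaces 'a' + 'b' with the literal 'ab'.

    Assumption: only side-effect-free string concatenation is treated as a
    recoverable fragment; a real deobfuscator would cover many more forms.
    """

    def visit_BinOp(self, node):
        # Recover the children first so nested concatenations collapse.
        self.generic_visit(node)
        left, right = node.left, node.right
        if (
            isinstance(node.op, ast.Add)
            and isinstance(left, ast.Constant) and isinstance(left.value, str)
            and isinstance(right, ast.Constant) and isinstance(right.value, str)
        ):
            # "Emulate" the fragment and splice the result back into the tree.
            folded = ast.Constant(left.value + right.value)
            return ast.copy_location(folded, node)
        return node


def recover(source: str) -> str:
    """Parse the source, fold recoverable fragments, and emit source again."""
    tree = FoldStrings().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


# `run` is a hypothetical placeholder for a command-invoking function.
obfuscated = "run('Inv' + 'oke-' + 'Expression')"
print(recover(obfuscated))  # run('Invoke-Expression')
```

Working bottom-up means an outer fragment is examined only after its inner fragments have been replaced, which is why the three-part concatenation collapses into a single literal in one pass.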

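Abstract (translated from the Chinese version)

In recent years, PowerShell attacks have been reported with increasing frequency. However, because the PowerShell language is dynamic and can construct script fragments at different levels, even state-of-the-art PowerShell attack detection methods based on static script analysis are inherently susceptible to obfuscation. This paper designs a generic, effective, and lightweight deobfuscation method for PowerShell scripts. First, to precisely identify obfuscated script fragments, we propose a novel obfuscated-fragment detection method based on the impact of obfuscation methods on the PowerShell abstract syntax tree, and on this basis we propose an emulation-based recovery technique. In addition, we design a semantic-aware PowerShell attack detection system that uses the classic objective-oriented association mining algorithm and newly identifies 31 semantic features for malicious script detection. Experimental results on 2342 benign samples and 4141 malicious samples show that the proposed deobfuscation method takes less than 0.5 s on average and raises the similarity between obfuscated and original scripts from 0.5% to 93.2%. With this deobfuscation method, the attack detection rates of Windows Defender and VirusTotal rise from 0.33% and 2.65% to 78.9% and 94.0%, respectively. The experiments also show that our detection system outperforms both existing tools (average true positive rate of 96.7% and false positive rate of 0%).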
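The abstract names objective-oriented association (OOA) mining as the basis for the semantic signatures. As a rough sketch of the general idea only, and not the authors' algorithm or data, the code below enumerates small itemsets and keeps those that are both frequent among malicious samples and strongly associated with the "malicious" objective. The function name `mine_ooa_rules`, the thresholds, and the behavior labels are all hypothetical, and the brute-force enumeration stands in for the efficient mining a real system would need.

```python
from itertools import combinations


def mine_ooa_rules(samples, min_support=0.3, min_confidence=0.9, max_size=2):
    """Return {itemset: (support, confidence)} for rules "itemset -> malicious".

    samples: list of (set_of_items, is_malicious) pairs.
    support    = (# malicious samples containing the itemset) / (# samples)
    confidence = (# malicious samples containing the itemset)
                 / (# samples containing the itemset)
    """
    total = len(samples)
    vocabulary = sorted({item for items, _ in samples for item in items})
    rules = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(vocabulary, size):
            itemset = frozenset(candidate)
            labels = [label for items, label in samples if itemset <= items]
            if not labels:
                continue  # itemset never occurs together
            malicious_hits = sum(labels)
            support = malicious_hits / total
            confidence = malicious_hits / len(labels)
            if support >= min_support and confidence >= min_confidence:
                rules[itemset] = (support, confidence)
    return rules


# Hypothetical behavior labels extracted from four toy scripts.
samples = [
    ({"download", "execute"}, True),
    ({"download", "execute", "hidden_window"}, True),
    ({"download", "write_file"}, False),
    ({"read_file"}, False),
]
rules = mine_ooa_rules(samples)
for itemset, (support, confidence) in rules.items():
    print(sorted(itemset), support, confidence)
```

On this toy data, `download` alone is rejected because it also appears in a benign sample, while `execute` and the pair `download` plus `execute` pass both thresholds. This illustrates why a combination of behaviors can serve as a signature when a single behavior cannot.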

Author information

Contributions

Chunlin XIONG and Zhenyuan LI designed the research. Tiantian ZHU and Hai YANG investigated the background. Jian WANG and Hai YANG processed the data. Zhenyuan LI and Chunlin XIONG drafted the paper. Wei RUAN and Tiantian ZHU helped organize the paper. Yan CHEN, Tiantian ZHU, and Wei RUAN revised and finalized the paper.

Corresponding author

Correspondence to Wei Ruan (阮伟).

Ethics declarations

Chunlin XIONG, Zhenyuan LI, Yan CHEN, Tiantian ZHU, Jian WANG, Hai YANG, and Wei RUAN declare that they have no conflict of interest.

Additional information

Project supported by the National Natural Science Foundation of China (No. U1936215)

About this article

Cite this article

Xiong, C., Li, Z., Chen, Y. et al. Generic, efficient, and effective deobfuscation and semantic-aware attack detection for PowerShell scripts. Front Inform Technol Electron Eng 23, 361–381 (2022). https://doi.org/10.1631/FITEE.2000436

  • DOI: https://doi.org/10.1631/FITEE.2000436
