SyzRisk: A Change-Pattern-Based Continuous Kernel Regression Fuzzer

Authors:

Byoungyoung Lee,

Mathias Payer

ASIA CCS '24: Proceedings of the 19th ACM Asia Conference on Computer and Communications Security

Pages 1480 - 1494

https://doi.org/10.1145/3634737.3637642

Published: 01 July 2024

Abstract

Syzbot continuously fuzzes the full Linux kernel to discover latent bugs. Yet around 75% of recent kernel bugs are regression bugs, that is, bugs introduced by recent patches. Regression fuzzing prioritizes inputs that target recently or frequently patched code. However, this heuristic breaks down in the kernel environment because there are too many patches (and therefore too many targets).

To improve regression fuzzing, we note that certain code change patterns (e.g., modifying GOTO) carry more risk of introducing bugs than others. Leveraging this observation, we introduce SyzRisk, a continuous regression fuzzer for the kernel that stresses bug-prone code changes. SyzRisk introduces code change patterns that allow for identifying risky code changes. After systematically estimating the risk of suspected change patterns under various circumstances, SyzRisk assigns more weight to risky change patterns. Using the accumulated corpus from prior continuous fuzzing, SyzRisk further prioritizes mutation inputs based on the observed weights.
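The weighting idea described above can be illustrated with a minimal Python sketch. It is not SyzRisk's implementation: the risk formula (how often a pattern appears in known root-cause commits relative to all commits), the max-based aggregation, and every name here (`estimate_pattern_risk`, `commit_weight`, `seed_weight`, `pick_seed`, and pattern labels such as `goto_modified`) are assumptions made for illustration only.

```python
import random
from collections import Counter


def estimate_pattern_risk(commits, root_cause_ids):
    """Assumed risk metric: P(pattern | root-cause commit) / P(pattern | any commit).

    commits maps a commit id to the list of change patterns it matches.
    """
    all_counts = Counter()
    rc_counts = Counter()
    for commit_id, patterns in commits.items():
        for pattern in set(patterns):
            all_counts[pattern] += 1
            if commit_id in root_cause_ids:
                rc_counts[pattern] += 1

    n_all = len(commits)
    n_rc = sum(1 for commit_id in commits if commit_id in root_cause_ids)
    risk = {}
    for pattern, count in all_counts.items():
        if n_rc == 0:
            risk[pattern] = 1.0  # no evidence: fall back to the baseline
        else:
            risk[pattern] = (rc_counts[pattern] / n_rc) / (count / n_all)
    return risk


def commit_weight(patterns, risk):
    """Hypothetical aggregation: riskiest matched pattern, never below 1.0."""
    return max([risk.get(p, 1.0) for p in patterns] + [1.0])


def seed_weight(covered_commits, commits, risk):
    """A corpus input scores the summed weight of the changes it reaches."""
    return sum(commit_weight(commits[c], risk) for c in covered_commits)


def pick_seed(corpus, commits, risk, rng=random):
    """Choose the next input to mutate, proportionally to its weight."""
    seeds = list(corpus)
    weights = [seed_weight(corpus[s], commits, risk) for s in seeds]
    if sum(weights) == 0:
        return rng.choice(seeds)  # nothing reaches changed code: uniform pick
    return rng.choices(seeds, weights=weights, k=1)[0]


# Toy data (entirely made up): four commits, two of them known root causes.
commits = {
    "c1": ["goto_modified"],
    "c2": ["comment_only"],
    "c3": ["goto_modified", "lock_added"],
    "c4": ["comment_only"],
}
root_causes = {"c1", "c3"}
# Each corpus input lists the changed commits its execution covers.
corpus = {"seed_a": ["c1", "c2"], "seed_b": ["c4"]}

risk = estimate_pattern_risk(commits, root_causes)
chosen = pick_seed(corpus, commits, risk, rng=random.Random(0))
```

In this toy setup, `seed_a` reaches a change matching a pattern that is over-represented among root causes, so it is selected for mutation more often than `seed_b`. A real system would derive both the pattern matches and the per-input coverage from the kernel source history and the fuzzer's coverage data.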

We simulated pattern creation by developers using 146 known Linux kernel root causes, including 38 CVE root causes, and collected 23 risky change patterns. The evaluation shows that the pattern-based weighting method highlights root-cause commits 3.60x more than the heuristic of simply targeting recent and frequent changes. Our evaluation on Linux kernel v6.0 demonstrates that SyzRisk achieves a 61% speedup in bug exposure time compared to Syzkaller while discovering the most complete set of bugs among all compared fuzzers.

Index Terms

1. Security and privacy
  1. Systems security
    1. Operating systems security

Published In

ASIA CCS '24: Proceedings of the 19th ACM Asia Conference on Computer and Communications Security

July 2024

1987 pages

ISBN:9798400704826

DOI:10.1145/3634737

Chair: Jianying Zhou
Co-chair: Tony Q. S. Quek
Program Chairs: Debin Gao, Alvaro Cardenas (University of California, Santa Cruz, USA)

Copyright © 2024 Copyright is held by the owner/author(s). Publication rights licensed to ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGSAC: ACM Special Interest Group on Security, Audit, and Control

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

ERC
SNSF
DARPA

Conference

ASIA CCS '24

Sponsor:

SIGSAC

ASIA CCS '24: 19th ACM Asia Conference on Computer and Communications Security

July 1 - 5, 2024

Singapore, Singapore

Acceptance Rates

Overall Acceptance Rate 418 of 2,322 submissions, 18%
