skip to main content
10.1145/3634737.3637642acmconferencesArticle/Chapter ViewAbstractPublication Pagesasia-ccsConference Proceedingsconference-collections
research-article

SyzRisk: A Change-Pattern-Based Continuous Kernel Regression Fuzzer

Published: 01 July 2024 Publication History

Abstract

Syzbot continuously fuzzes the full Linux kernel to discover latent bugs. Yet, around 75% of recent kernel bugs are caused by recent patches, dubbed regression bugs. Regression fuzzing prioritizes inputs that target recently or frequently patched code. However, this heuristic breaks down in the kernel environment as there are too many patches (and therefore too many targets).
To improve regression fuzzing, we note that certain code change patterns (e.g., modifying GOTO) carry more risk of introducing bugs than others. Leveraging this observation, we introduce SyzRisk, a continuous regression fuzzer for the kernel that stresses bug-prone code changes. SyzRisk introduces code change patterns that allow for identifying risky code changes. After systematically estimating the risk of suspected change patterns under various circumstances, SyzRisk assigns more weight to risky change patterns. Using the accumulated corpus from prior continuous fuzzing, SyzRisk further prioritizes mutation inputs based on the observed weights.
We simulated the pattern creation from developers using 146 known Linux kernel root causes including 38 CVE root causes and collected 23 risky change patterns. The evaluation shows that the pattern-based weighting method highlights root-cause commits 3.60x more compared to the heuristic of simply targeting recent and frequent changes. Our evaluation of the Linux kernel v6.0 demonstrates that SyzRisk records a 61% speedup in bug exposure time compared to Syzkaller, while discovering the most complete set of bugs across all compared fuzzers.

References

[1]
CodeQL. https://codeql.github.com/.
[2]
difflib. https://docs.python.org/3/library/difflib.html.
[3]
gitpy. https://gitpython.readthedocs.io/en/stable/.
[4]
ImageMagick. https://github.com/ImageMagick/ImageMagick.
[5]
Joern: open-source code analysis platform for C/C++ based on code property graphs. https://joern.io/.
[6]
OSSfuzz. https://github.com/google/oss-fuzz.
[7]
Syzbot. https://syzkaller.appspot.com/.
[8]
Nikolaos Alexopoulos, Manuel Brack, Jan Philipp Wagner, Tim Grube, and Max Mühlhäuser. How long do vulnerabilities live in the code? a Large-Scale empirical measurement study on FOSS vulnerability lifetimes. In 31st USENIX Security Symposium (USENIX Security 22), pages 359--376, 2022.
[9]
Tim Blazytko, Moritz Schlögel, Cornelius Aschermann, Ali Abbasi, Joel Frank, Simon Wörner, and Thorsten Holz. AURORA: Statistical crash analysis for automated root cause explanation. In 29th USENIX Security Symposium (USENIX Security 20), 2020.
[10]
Marcel Böhme, Van-Thuan Pham, Manh-Dung Nguyen, and Abhik Roychoudhury. Directed greybox fuzzing. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS '17, page 2329--2344, New York, NY, USA, 2017. Association for Computing Machinery.
[11]
Sadullah Canakci, Nikolay Matyunin, Kalman Graffi, Ajay Joshi, and Manuel Egele. TargetFuzz: Using darts to guide directed greybox fuzzers. In Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security, ASIA CCS '22, page 561--573, New York, NY, USA, 2022. Association for Computing Machinery.
[12]
Sicong Cao, Xiaobing Sun, Lili Bo, Rongxin Wu, Bin Li, and Chuanqi Tao. MVD: Memory-related vulnerability detection based on flow-sensitive graph neural networks. In Proceedings of the 44th International Conference on Software Engineering, ICSE '22, page 1456--1468, New York, NY, USA, 2022. Association for Computing Machinery.
[13]
Hongxu Chen, Yinxing Xue, Yuekang Li, Bihuan Chen, Xiaofei Xie, Xiuheng Wu, and Yang Liu. Hawkeye: Towards a desired directed grey-box fuzzer. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, CCS '18, page 2095--2108, New York, NY, USA, 2018. Association for Computing Machinery.
[14]
Yaohui Chen, Peng Li, Jun Xu, Shengjian Guo, Rundong Zhou, Yulong Zhang, Tao Wei, and Long Lu. SAVIOR: Towards bug-driven hybrid testing. In 2020 IEEE Symposium on Security and Privacy (SP), pages 1580--1596, 2020.
[15]
Mingi Cho, Dohyeon An, Hoyong Jin, and Taekyoung Kwon. BoKASAN: Binary-only kernel address sanitizer for effective kernel fuzzing. In 32nd USENIX Security Symposium (USENIX Security 23), 2023.
[16]
Jaeseung Choi, Kangsu Kim, Daejin Lee, and Sang Kil Cha. NtFuzz: Enabling type-aware kernel fuzzing on windows with static binary analysis. In 2021 IEEE Symposium on Security and Privacy (SP), pages 677--693, 2021.
[17]
Agnieszka Ciborowska and Kostadin Damevski. Fast changeset-based bug localization with bert. In Proceedings of the 44th International Conference on Software Engineering, ICSE '22, page 946--957, New York, NY, USA, 2022. Association for Computing Machinery.
[18]
Weidong Cui, Marcus Peinado, Sang Kil Cha, Yanick Fratantonio, and Vasileios P. Kemerlis. Retracer: Triaging crashes by reverse execution from partial memory dumps. In Proceedings of the 38th International Conference on Software Engineering, ICSE '16, page 820--831, New York, NY, USA, 2016. Association for Computing Machinery.
[19]
Xiaoning Du, Bihuan Chen, Yuekang Li, Jianmin Guo, Yaqin Zhou, Yang Liu, and Yu Jiang. Leopard: Identifying vulnerable code for vulnerability assessment through program metrics. In Proceedings of the 41st International Conference on Software Engineering, ICSE '19, page 60--71. IEEE Press, 2019.
[20]
Jean-Rémy Falleri, Floréal Morandat, Xavier Blanc, Matias Martinez, and Martin Monperrus. Fine-grained and accurate source code differencing. In Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering, 2014.
[21]
Marius Fleischer, Dipanjan Das, Priyanka Bose, Weiheng Bai, Kangjie Lu, Mathias Payer, Christopher Kruegel, and Giovanni Vigna. ACTOR: Action-Guided kernel fuzzing. In 32nd USENIX Security Symposium (USENIX Security 23), 2023.
[22]
Google. syzkaller - linux syscall fuzzer, 2017. https://github.com/google/syzkaller.
[23]
Tracy Hall, Sarah Beecham, David Bowes, David Gray, and Steve Counsell. A systematic literature review on fault prediction performance in software engineering. IEEE Transactions on Software Engineering, 38(6):1276--1304, 2012.
[24]
HyungSeok Han and Sang Kil Cha. IMF: Inferred model-based fuzzer. In Proceedings of the ACM SIGSAC Conference on Computer and Communications Security (CCS), pages 2345--2358. ACM, 2017.
[25]
Thong Hoang, Hong Jin Kang, David Lo, and Julia Lawall. CC2Vec: Distributed representations of code changes. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, ICSE '20, page 518--529, New York, NY, USA, 2020. Association for Computing Machinery.
[26]
Thong Hoang, Hoa Khanh Dam, Yasutaka Kamei, David Lo, and Naoyasu Ubayashi. Deepjit: An end-to-end deep learning framework for just-in-time defect prediction. In 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR), pages 34--45, 2019.
[27]
Aram Hovsepyan, Riccardo Scandariato, Wouter Joosen, and James Walden. Software vulnerability prediction using text analysis techniques. In Proceedings of the 4th International Workshop on Security Measurements and Metrics, MetriSec '12, page 7--10, New York, NY, USA, 2012. Association for Computing Machinery.
[28]
Dae R Jeong, Kyungtae Kim, Basavesh Shivakumar, Byoungyoung Lee, and Insik Shin. Razzer: Finding kernel race bugs through fuzzing. In Proceedings of the IEEE Symposium on Security and Privacy (S&P), page 0. IEEE, 2018.
[29]
Hyungsub Kim, Muslum Ozgur Ozmen, Z. Berkay Celik, Antonio Bianchi, and Dongyan Xu. PatchVerif: Discovering faulty patches in robotic vehicles. In Proceedings of the 32nd USENIX Conference on Security Symposium [Prepublication], SEC'23. USENIX Association, 2023.
[30]
Kyungtae Kim, Dae R Jeong, Chung Hwan Kim, Yeongjin Jang, Insik Shin, and Byoungyoung Lee. HFL: Hybrid fuzzing on the linux kernel. In NDSS, 2020.
[31]
Chris Lattner and Vikram Adve. LLVM: A compilation framework for lifelong program analysis & transformation. In International Symposium on Code Generation and Optimization, 2004. CGO 2004., pages 75--86. IEEE, 2004.
[32]
Yuwei Li, Shouling Ji, Chenyang Lyu, Yuan Chen, Jianhai Chen, Qinchen Gu, Chunming Wu, and Raheem Beyah. V-Fuzz: Vulnerability prediction-assisted evolutionary fuzzing for binary programs. IEEE Transactions on Cybernetics, 52(5):3745--3756, 2022.
[33]
Zhen Li, Deqing Zou, Shouhuai Xu, Xinyu Ou, Hai Jin, Sujuan Wang, Zhijun Deng, and Yuyi Zhong. Vuldeepecker: A deep learning-based system for vulnerability detection. In 25th Annual Network and Distributed System Security Symposium, NDSS 2018, San Diego, California, USA, February 18-21, 2018. The Internet Society, 2018.
[34]
Zhenpeng Lin, Yueqi Chen, Yuhang Wu, Dongliang Mu, Chensheng Yu, Xinyu Xing, and Kang Li. GREBE: Unveiling exploitation potential for linux kernel bugs. In 2022 IEEE Symposium on Security and Privacy (SP), 2022.
[35]
Raimund Moser, Witold Pedrycz, and Giancarlo Succi. A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction. In 2008 ACM/IEEE 30th International Conference on Software Engineering, pages 181--190, 2008.
[36]
Nachiappan Nagappan, Andreas Zeller, Thomas Zimmermann, Kim Herzig, and Brendan Murphy. Change bursts as defect predictors. In 2010 IEEE 21st International Symposium on Software Reliability Engineering, pages 309--318, 2010.
[37]
Yannic Noller, Corina S Păsăreanu, Marcel Böhme, Youcheng Sun, Hoang Lam Nguyen, and Lars Grunske. HyDiff: Hybrid differential software analysis. In 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE), pages 1273--1285. IEEE, 2020.
[38]
Sebastian Österlund, Kaveh Razavi, Herbert Bos, and Cristiano Giuffrida. Parmesan: Sanitizer-guided greybox fuzzing. In Proceedings of the 29th USENIX Conference on Security Symposium, SEC'20, USA, 2020. USENIX Association.
[39]
Shankara Pailoor, Andrew Aday, and Suman Jana. MoonShine: Optimizing OS fuzzer seed selection with trace distillation. In 27th USENIX Security Symposium (USENIX Security 18), pages 729--743, 2018.
[40]
Yulei Pang, Xiaozhen Xue, and Akbar Siami Namin. Predicting vulnerable software components through n-gram analysis and statistical feature selection. In 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA), pages 543--548, 2015.
[41]
Hui Peng and Mathias Payer. USBFuzz: A framework for fuzzing usb drivers by device emulation. In Proceedings of the 29th USENIX Conference on Security Symposium, SEC'20, USA, 2020. USENIX Association.
[42]
Theofilos Petsios, Adrian Tang, Salvatore Stolfo, Angelos D. Keromytis, and Suman Jana. Nezha: Efficient domain-independent differential testing. In 2017 IEEE Symposium on Security and Privacy (SP), pages 615--632, 2017.
[43]
Gordon Plotkin. A note on inductive generalization. Machine Intelligence, 1970.
[44]
Sergej Schumilo, Cornelius Aschermann, Robert Gawlik, Sebastian Schinzel, and Thorsten Holz. KAFL: Hardware-assisted feedback fuzzing for os kernels. In Proceedings of the 26th USENIX Conference on Security Symposium, SEC'17, page 167--182, USA, 2017. USENIX Association.
[45]
Dokyung Song, Felicitas Hetzelt, Dipanjan Das, Chad Spensky, Yeoul Na, Stijn Volckaert, Giovanni Vigna, Christopher Kruegel, Jean-Pierre Seifert, and Michael Franz. PeriScope: An effective probing and fuzzing framework for the hardwareOS boundary. In Network and Distributed System Security Symposium (NDSS), 2019.
[46]
Hao Sun, Yuheng Shen, Cong Wang, Jianzhong Liu, Yu Jiang, Ting Chen, and Aiguo Cui. HEALER: Relation learning guided kernel fuzzing. In Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles, SOSP '21, page 344--358, New York, NY, USA, 2021. Association for Computing Machinery.
[47]
Daimeng Wang, Zheng Zhang, Hang Zhang, Zhiyun Qian, Srikanth V Krishnamurthy, and Nael Abu-Ghazaleh. SyzVegas: Beating kernel fuzzing odds with reinforcement learning. In 30th USENIX Security Symposium (USENIX Security 21), pages 2741--2758, 2021.
[48]
Shu Wang, Xinda Wang, Kun Sun, Sushil Jajodia, Haining Wang, and Qi Li. GraphSPD: Graph-based security patch detection with enriched code semantics. In 2023 IEEE Symposium on Security and Privacy (SP), pages 2409--2426, 2023.
[49]
Xinda Wang, Kun Sun, Archer Batcheller, and Sushil Jajodia. Detecting "0-day" vulnerability: An empirical study of secret security patch in oss. In 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), pages 485--492, 2019.
[50]
Xinda Wang, Shu Wang, Pengbin Feng, Kun Sun, Sushil Jajodia, Sanae Benchaaboun, and Frank Geck. PatchRNN: A deep learning-based system for security patch identification. In 2021 IEEE Military Communications Conference (MILCOM), 2021.
[51]
Yueming Wu, Deqing Zou, Shihan Dou, Wei Yang, Duo Xu, and Hai Jin. VulCNN: An image-inspired scalable vulnerability detection system. In Proceedings of the 44th International Conference on Software Engineering, ICSE '22, page 2365--2376, New York, NY, USA, 2022. Association for Computing Machinery.
[52]
Yuhang Wu, Zhenpeng Lin, Yueqi Chen, Dang K Le, Dongliang Mu, and Xinyu Xing. Mitigating security risks in linux with KLAUS: A method for evaluating patch correctness. In Proceedings of the 32nd USENIX Conference on Security Symposium, 2023.
[53]
Xiaoyuan Xie, Tsong Yueh Chen, and Baowen Xu. Isolating suspiciousness from spectrum-based fault localization techniques. In 2010 10th International Conference on Quality Software, pages 385--392, 2010.
[54]
Jiadong Lu Xin Xiong Zhuang Liu Xin Tan, Yuan Zhang and Min Yang. SyzDirect: Directed greybox fuzzing for linux kernel. In Proceedings of the 30th ACM Conference on Computer and Communications Security (CCS), 2023.
[55]
Jian Xu, Zhenyu Zhang, W. K. Chan, T. H. Tse, and Shanping Li. A general noise-reduction framework for fault localization of java programs. Inf. Softw. Technol., 55(5):880--896, may 2013.
[56]
Carter Yagemann, Simon P. Chung, Brendan Saltaformaggio, and Wenke Lee. Automated bug hunting with data-driven symbolic root cause analysis. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, CCS '21, page 320--336, New York, NY, USA, 2021. Association for Computing Machinery.
[57]
Wei You, Peiyuan Zong, Kai Chen, XiaoFeng Wang, Xiaojing Liao, Pan Bian, and Bin Liang. SemFuzz: Semantics-based automatic generation of proof-of-concept exploits. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS '17, page 2139--2154, New York, NY, USA, 2017. Association for Computing Machinery.
[58]
M. Zalewski. American fuzzy lop, 2017. http://lcamtuf.coredump.cx/afl/technical_details.txt.
[59]
Zhengran Zeng, Yuqun Zhang, Haotian Zhang, and Lingming Zhang. Deep just-in-time defect prediction: How far are we? In Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis, ISSTA 2021, page 427--438, New York, NY, USA, 2021. Association for Computing Machinery.
[60]
Yizhuo Zhai, Yu Hao, Zheng Zhang, Weiteng Chen, Guoren Li, Zhiyun Qian, Chengyu Song, Manu Sridharan, Srikanth V Krishnamurthy, Trent Jaeger, et al. Progressive scrutiny: Incremental detection of ubi bugs in the linux kernel. In 2022 Network and Distributed System Security Symposium, 2022.
[61]
Yunhui Zheng, Saurabh Pujar, Burn Lewis, Luca Buratti, Edward Epstein, Bo Yang, Jim Laredo, Alessandro Morari, and Zhong Su. D2A: A dataset built for ai-based vulnerability detection methods using differential analysis. In Proceedings of the 43rd International Conference on Software Engineering: Software Engineering in Practice, ICSE-SEIP '21, page 111--120. IEEE Press, 2021.
[62]
Yaqin Zhou, Shangqing Liu, Jingkai Siow, Xiaoning Du, and Yang Liu. Devign: Effective Vulnerability Identification by Learning Comprehensive Program Semantics via Graph Neural Networks. Curran Associates Inc., Red Hook, NY, USA, 2019.
[63]
Yaqin Zhou, Jing Kai Siow, Chenyu Wang, Shangqing Liu, and Yang Liu. SPI: Automated identification of security patches via commits. ACM Trans. Softw. Eng. Methodol., 2021.
[64]
Xiaogang Zhu and Marcel Böhme. Regression greybox fuzzing. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, pages 2169--2182, 2021.

Index Terms

  1. SyzRisk: A Change-Pattern-Based Continuous Kernel Regression Fuzzer

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    ASIA CCS '24: Proceedings of the 19th ACM Asia Conference on Computer and Communications Security
    July 2024
    1987 pages
    ISBN:9798400704826
    DOI:10.1145/3634737
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 01 July 2024

    Check for updates

    Author Tags

    1. continuous fuzzing
    2. kernel security
    3. regression testing
    4. development study
    5. code analysis

    Qualifiers

    • Research-article

    Funding Sources

    • ERC
    • SNSF
    • DARPA

    Conference

    ASIA CCS '24
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 418 of 2,322 submissions, 18%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • 0
      Total Citations
    • 146
      Total Downloads
    • Downloads (Last 12 months)146
    • Downloads (Last 6 weeks)21
    Reflects downloads up to 20 Feb 2025

    Other Metrics

    Citations

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Figures

    Tables

    Media

    Share

    Share

    Share this Publication link

    Share on social media