research-article

ProphetFuzz: Fully Automated Prediction and Fuzzing of High-Risk Option Combinations with Only Documentation via Large Language Model

Authors:

Yukai Miao

CCS '24: Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security

Pages 735 - 749

https://doi.org/10.1145/3658644.3690231

Published: 09 December 2024

Abstract

Vulnerabilities related to option combinations pose a significant challenge in software security testing due to their vast search space. Previous research primarily addressed this challenge through mutation or filtering techniques, which inefficiently treated all option combinations as having equal potential for vulnerabilities, thus wasting considerable time on non-vulnerable targets and resulting in low testing efficiency. In this paper, we use carefully designed prompt engineering to drive a large language model (LLM) to predict high-risk option combinations (i.e., those more likely to contain vulnerabilities) and to perform fuzz testing automatically without human intervention. We developed a tool called ProphetFuzz and evaluated it on a dataset comprising 52 programs collected from three related studies. The entire experiment consumed 10.44 CPU years. ProphetFuzz successfully predicted 1748 high-risk option combinations at an average cost of only $8.69 per program. Results show that after 72 hours of fuzzing, ProphetFuzz discovered 364 unique vulnerabilities associated with 12.30% of the predicted high-risk option combinations, which was 32.85% higher than that found by the state-of-the-art approach in the same timeframe. Additionally, using ProphetFuzz, we conducted persistent fuzzing on the latest versions of these programs, uncovering 140 vulnerabilities, with 93 confirmed by developers and 21 awarded CVE numbers.
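
The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope reading, not a calculation taken from the paper: it assumes the total prediction cost is the average cost multiplied by the number of programs, and that the 32.85% improvement refers to the count of unique vulnerabilities. The function name `derived_figures` and the dictionary keys are hypothetical.

```python
# Figures quoted in the abstract.
PROGRAMS = 52
PREDICTED_COMBINATIONS = 1748
AVG_COST_PER_PROGRAM_USD = 8.69
VULNERABLE_SHARE = 0.1230      # share of predicted combinations tied to a vulnerability
UNIQUE_VULNERABILITIES = 364
RELATIVE_GAIN = 0.3285         # "32.85% higher" than the state of the art


def derived_figures() -> dict:
    """Derive rough quantities implied by the abstract's numbers."""
    return {
        # Assumption: total cost is the average cost times the program count.
        "total_prediction_cost_usd": round(PROGRAMS * AVG_COST_PER_PROGRAM_USD, 2),
        "avg_combinations_per_program": round(PREDICTED_COMBINATIONS / PROGRAMS, 1),
        "combinations_with_vulnerabilities": round(
            PREDICTED_COMBINATIONS * VULNERABLE_SHARE
        ),
        # Assumption: the 32.85% gain applies to the unique-vulnerability count.
        "implied_baseline_vulnerabilities": round(
            UNIQUE_VULNERABILITIES / (1 + RELATIVE_GAIN)
        ),
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value}")
```

Under these assumptions, the predicted combinations average roughly 34 per program, and about 215 of them are associated with at least one discovered vulnerability.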

Index Terms

1. Security and privacy
  1. Software and application security

Published In

CCS '24: Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security

December 2024

5188 pages

ISBN: 9798400706363

DOI: 10.1145/3658644

General Chairs: Bo Luo (University of Kansas, USA), Xiaojing Liao (Indiana University Bloomington, USA), Jun Xu (University of Utah, USA)

Program Chairs: Engin Kirda (Northeastern University, USA), David Lie (University of Toronto, Canada)

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGSAC: ACM Special Interest Group on Security, Audit, and Control

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

CCS '24: ACM SIGSAC Conference on Computer and Communications Security

October 14 - 18, 2024

Salt Lake City, UT, USA

Acceptance Rates

Overall Acceptance Rate 1,261 of 6,999 submissions, 18%
