research-article

Dissecting American Fuzzy Lop: A FuzzBench Evaluation

Authors:
Andrea Fioraldi
EURECOM, Biot, France
0000-0002-0976-4395

Alessandro Mantovani
EURECOM, Biot, France
0000-0003-4813-8562

Dominik Maier
Technische Universität Berlin, Berlin, Germany
0000-0002-5588-5008

Davide Balzarotti
EURECOM, Biot, France
0000-0001-5957-6213

ACM Transactions on Software Engineering and Methodology, Volume 32, Issue 2, Article No. 52, pp. 1–26. https://doi.org/10.1145/3580596

Published: 29 March 2023

Abstract

AFL is one of the most used and extended fuzzers, adopted by industry and academic researchers alike. Although the community agrees on AFL’s effectiveness at discovering new vulnerabilities and its outstanding usability, many of its internal design choices remain untested to date. Security practitioners often clone the project “as-is” and use it as a starting point to develop new techniques, usually taking everything under the hood for granted. Instead, we believe that a careful analysis of the different parameters could help modern fuzzers improve their performance and explain how each choice can affect the outcome of security testing, either negatively or positively.

The goal of this work is to provide a comprehensive understanding of the internal mechanisms of AFL by performing experiments and by comparing different metrics used to evaluate fuzzers. This can help to show the effectiveness of some techniques and to clarify which aspects are instead outdated. For our study, we carried out nine unique experiments on the popular FuzzBench platform. Each test focuses on a different aspect of AFL, ranging from its mutation approach to the feedback encoding scheme and its scheduling methodologies.

Our findings show that each design choice affects different factors of AFL. Some of these are positively correlated with the number of detected bugs or the coverage of the target application, whereas other features are related to usability and reliability. Most importantly, we believe that the outcome of our experiments indicates which parts of AFL we should preserve in the design of modern fuzzers.

Index Terms

Dissecting American Fuzzy Lop: A FuzzBench Evaluation
1. Security and privacy
  1. Software and application security
    1. Software security engineering
2. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
        Software testing and debugging

Published in
ACM Transactions on Software Engineering and Methodology Volume 32, Issue 2
March 2023
946 pages
ISSN: 1049-331X
EISSN: 1557-7392
DOI: 10.1145/3586025
Editor:
Mauro Pezzè
USI Università della Svizzera italiana and SIT Schaffhausen Institute of Technology, Switzerland
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 29 March 2023
- Online AM: 20 January 2023
- Accepted: 14 December 2022
- Revised: 2 September 2022
- Received: 15 June 2022
Author Tags
Fuzzing
AFL
FuzzBench
Qualifiers
- research-article