ABSTRACT
Starting with a random initial seed, fuzzers search for inputs that trigger bugs or vulnerabilities. However, fuzzers often fail to generate inputs for program paths guarded by restrictive branch conditions. In this paper, we show that one can improve the coverage of fuzzing tools by first identifying rare paths in programs (i.e., program paths with path constraints that are unlikely to be satisfied by random input generation) and then generating inputs/seeds that trigger those rare paths. In particular, we present techniques that 1) identify rare paths using quantitative symbolic analysis, and 2) generate inputs that can explore these rare paths using path-guided concolic execution. We provide these inputs as initial seed sets to three state-of-the-art fuzzers. Our experimental evaluation on a set of programs shows that the fuzzers achieve better coverage with the rare-path-based seed set than with a random initial seed.
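The notion of a rare path can be made concrete with a small sketch. The Python below is an illustration under stated assumptions, not the paper's implementation: `toy_program` is a hypothetical two-byte program, exhaustive enumeration over a tiny input domain stands in for quantitative symbolic analysis (model counting), and recording the first input seen on each path stands in for path-guided concolic execution. All function names here are invented for the example.

```python
from collections import Counter
from itertools import product

# Assumption: every input is a pair of bytes, so the domain is small enough
# to enumerate. A real analysis would use a model counter instead.
DOMAIN = range(256)


def toy_program(a: int, b: int) -> str:
    """Hypothetical program under test; returns a label for the path taken."""
    if a == 0x7F:                      # restrictive equality check
        if b > 250:                    # second restrictive check on the same path
            return "magic-and-large"
        return "magic-only"
    if a < 128:
        return "low"
    return "high"


def path_probabilities():
    """Count the inputs that reach each path and keep one witness per path."""
    counts = Counter()
    witnesses = {}
    for a, b in product(DOMAIN, DOMAIN):
        label = toy_program(a, b)
        counts[label] += 1
        witnesses.setdefault(label, (a, b))   # first input seen on this path
    total = len(DOMAIN) ** 2
    probs = {label: n / total for label, n in counts.items()}
    return probs, witnesses


def rare_path_seeds(k: int = 2):
    """Return (path label, probability, seed bytes) for the k rarest paths."""
    probs, witnesses = path_probabilities()
    rarest = sorted(probs, key=probs.get)[:k]
    return [(label, probs[label], bytes(witnesses[label])) for label in rarest]


if __name__ == "__main__":
    for label, prob, seed in rare_path_seeds():
        print(f"{label}: p={prob:.6f} seed={seed!r}")
```

In this toy, only 5 of the 65,536 possible inputs reach the `magic-and-large` path, so uniformly random generation is unlikely to hit it, whereas a seed taken from the witness reaches it immediately. In a fuzzing workflow such seeds would be written to files in the seed directory that the fuzzer reads at startup.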
Index Terms
- Rare Path Guided Fuzzing