ABSTRACT
Many real-world programs take highly structured and very complex inputs, which makes automated testing of such programs non-trivial. If a test input does not adhere to the expected file format, the program returns a parser error. For symbolic-execution-based whitebox fuzzing, the corresponding error-handling code becomes a significant time sink: too much time is spent in the parser, exploring too many paths that lead to trivial parser errors. That time is better spent exploring the functional part of the program, where failure on valid input exposes deep and real bugs. In this paper, we propose leveraging information about the file format and the data chunks of existing, valid files to swiftly carry the exploration beyond the parser code. We call our approach Model-based Whitebox Fuzzing (MoWF) because the file format input model of blackbox fuzzers can be exploited as a constraint on the vast input space to rule out most invalid inputs during path exploration in symbolic execution. We evaluate MoWF on 13 vulnerabilities in 8 large program binaries with 6 separate file formats and find that it exposes all vulnerabilities, while traditional whitebox fuzzing and model-based blackbox fuzzing each expose fewer than half. Our experiments also demonstrate that MoWF exposes 70% of the vulnerabilities without any seed inputs.
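The idea of using an input model as a constraint on the input space can be illustrated with a minimal sketch. This is not the MoWF implementation: the chunk layout (length, type, payload, CRC-32), the chunk names `HEAD` and `DATA`, and the helper functions `pack_chunk`, `parse_chunks`, and `rebuild_with` are all hypothetical and chosen only for illustration. The sketch shows why a naively mutated input dies in the parser, and how reusing the chunks of a valid file while re-deriving the model's integrity fields keeps the new input valid.

```python
import struct
import zlib

# Hypothetical chunk layout (assumption, PNG-like): 4-byte big-endian length,
# 4-byte type, payload, then a CRC-32 computed over type + payload.
def pack_chunk(ctype: bytes, data: bytes) -> bytes:
    crc = zlib.crc32(ctype + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)

def parse_chunks(blob: bytes):
    """Return a list of (type, payload) pairs, or None if the blob violates the model."""
    chunks, pos = [], 0
    while pos < len(blob):
        if pos + 8 > len(blob):
            return None  # truncated chunk header
        (length,) = struct.unpack_from(">I", blob, pos)
        ctype = blob[pos + 4:pos + 8]
        end = pos + 8 + length + 4
        if end > len(blob):
            return None  # declared length runs past the end of the input
        data = blob[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack_from(">I", blob, end - 4)
        if crc != zlib.crc32(ctype + data) & 0xFFFFFFFF:
            return None  # integrity field does not match the payload
        chunks.append((ctype, data))
        pos = end
    return chunks

def rebuild_with(chunks, index, new_data):
    """Swap one chunk payload and re-derive length and CRC so the input stays valid."""
    out = list(chunks)
    out[index] = (out[index][0], new_data)
    return b"".join(pack_chunk(t, d) for t, d in out)

# A valid seed made of two chunks.
seed = pack_chunk(b"HEAD", b"\x00\x01") + pack_chunk(b"DATA", b"hello")
chunks = parse_chunks(seed)

# Naive mutation: change one payload byte and keep the stale CRC.
naive = seed[:-5] + b"X" + seed[-4:]

# Model-aware mutation: same payload change, integrity fields recomputed.
repaired = rebuild_with(chunks, 1, b"hellX")
```

In this sketch `parse_chunks(naive)` rejects the input at the integrity check, whereas `parse_chunks(repaired)` accepts it, so only the model-aware input would reach code beyond the parser.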
Index Terms
- Model-based whitebox fuzzing for program binaries