research-article

Model-based whitebox fuzzing for program binaries

Authors:
Van-Thuan Pham, National University of Singapore, Singapore
Marcel Böhme, National University of Singapore, Singapore
Abhik Roychoudhury, National University of Singapore, Singapore

ASE '16: Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering, August 2016, Pages 543–553
https://doi.org/10.1145/2970276.2970316

Published: 25 August 2016

ABSTRACT

Many real-world programs take highly structured and very complex inputs, which makes their automated testing non-trivial. If a test input does not adhere to the expected file format, the program returns a parser error. For symbolic execution-based whitebox fuzzing, the corresponding error-handling code becomes a significant time sink: too much time is spent in the parser, exploring too many paths that lead only to trivial parser errors. That time is better spent exploring the functional part of the program, where a failure on a valid input exposes deep and real bugs. In this paper, we propose leveraging information about the file format and the data chunks of existing, valid files to swiftly carry the exploration beyond the parser code. We call our approach Model-based Whitebox Fuzzing (MoWF) because the file format input model used by blackbox fuzzers can be exploited as a constraint on the vast input space, ruling out most invalid inputs during path exploration in symbolic execution. We evaluated MoWF on 13 vulnerabilities in 8 large program binaries with 6 separate file formats and found that it exposes all of the vulnerabilities, whereas traditional whitebox fuzzing and model-based blackbox fuzzing each expose fewer than half. Our experiments also demonstrate that MoWF exposes 70% of the vulnerabilities without any seed inputs.
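
The core idea, that an input model keeps generated inputs valid so exploration reaches code behind the parser, can be illustrated with a small sketch. This is not the MoWF implementation, which works on program binaries with symbolic execution; it is a toy under stated assumptions. The file format (`TOY1` signature, one-byte length, CRC32 trailer), the planted bug, and all function names are hypothetical.

```python
import struct
import zlib

MAGIC = b"TOY1"  # hypothetical file signature, not taken from the paper


class ParseError(Exception):
    """Raised when an input violates the toy file format."""


def build(payload: bytes) -> bytes:
    """Input model: serialize a payload and fill in length and checksum."""
    header = MAGIC + struct.pack(">B", len(payload))
    return header + payload + struct.pack(">I", zlib.crc32(payload))


def parse(data: bytes) -> bytes:
    """Parser: reject anything that breaks the format's integrity rules."""
    if len(data) < 9 or data[:4] != MAGIC:
        raise ParseError("bad magic or truncated header")
    n = data[4]
    if len(data) != 5 + n + 4:
        raise ParseError("length field mismatch")
    payload = data[5:5 + n]
    (crc,) = struct.unpack(">I", data[5 + n:])
    if crc != zlib.crc32(payload):
        raise ParseError("checksum mismatch")
    return payload


def process(data: bytes) -> str:
    """Functional part: only reachable with a well-formed input."""
    payload = parse(data)
    if payload and payload[0] == 0xFF:
        return "bug reached"  # stand-in for a deep vulnerability
    return "ok"


def mutate_raw(seed: bytes, offset: int, value: int) -> bytes:
    """Format-unaware mutation: overwrite one byte of the raw file."""
    out = bytearray(seed)
    out[offset] = value
    return bytes(out)


def mutate_with_model(seed: bytes, index: int, value: int) -> bytes:
    """Model-aware mutation: change the payload, then repair the
    integrity fields by re-serializing through the input model."""
    payload = bytearray(parse(seed))
    payload[index] = value
    return build(bytes(payload))


def outcome(data: bytes) -> str:
    """Run the toy program and report where the input ended up."""
    try:
        return process(data)
    except ParseError as err:
        return f"parser error: {err}"


seed = build(b"\x00abc")
print(outcome(seed))
print(outcome(mutate_raw(seed, 5, 0xFF)))         # stops in the parser
print(outcome(mutate_with_model(seed, 0, 0xFF)))  # reaches functional code
```

Both mutations write the same byte value to the same payload position. Only the model-aware one recomputes the checksum, so only it gets past the parser to the planted bug.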

Index Terms

Model-based whitebox fuzzing for program binaries
1. Security and privacy
  1. Systems security
    1. Vulnerability management
      1. Vulnerability scanners
2. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
        Software testing and debugging

Published in
ASE '16: Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering
August 2016
899 pages
ISBN: 9781450338455
DOI: 10.1145/2970276
General Chair: David Lo, Singapore Management University, Singapore
Program Chairs: Sven Apel, University of Passau, Germany; Sarfraz Khurshid, University of Texas at Austin, USA
Copyright © 2016 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Program Binaries
Symbolic Execution
Acceptance Rates
Overall Acceptance Rate: 82 of 337 submissions, 24%