DOI: 10.1145/2970276.2970316
research-article

Model-based whitebox fuzzing for program binaries

Published: 25 August 2016

ABSTRACT

Many real-world programs take highly structured and complex inputs, which makes their automated testing non-trivial. If a test input does not adhere to the expected file format, the program returns a parser error. For symbolic-execution-based whitebox fuzzing, the corresponding error-handling code becomes a significant time sink: too much time is spent in the parser, exploring paths that lead only to trivial parser errors. That time is better spent exploring the functional part of the program, where a failure on valid input exposes deep and real bugs. In this paper, we suggest leveraging information about the file format and the data chunks of existing, valid files to swiftly carry the exploration beyond the parser code. We call our approach Model-based Whitebox Fuzzing (MoWF), because the file-format input model used by blackbox fuzzers can be exploited as a constraint on the vast input space to rule out most invalid inputs during path exploration in symbolic execution. We evaluate MoWF on 13 vulnerabilities in 8 large program binaries covering 6 separate file formats, and find that it exposes all 13 vulnerabilities, while traditional whitebox fuzzing and model-based blackbox fuzzing each expose less than half of them. Our experiments also demonstrate that MoWF exposes 70% of the vulnerabilities without any seed inputs.
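The core idea described in the abstract, pruning format-invalid inputs before they reach expensive symbolic exploration of parser error paths, can be sketched roughly as follows. This is an illustrative sketch only, not the paper's implementation; the magic-number check, the `satisfies_format_model` predicate, and all names are hypothetical stand-ins for a real input model (e.g., a Peach data model encoding chunk layout, sizes, and checksums).

```python
# Hypothetical sketch of the MoWF idea: a blackbox fuzzer's file-format
# model acts as an extra constraint that discards invalid candidate
# inputs, so symbolic path exploration is not wasted on the parser's
# trivial error-handling paths. All names here are illustrative.

MAGIC = b"\x89PNG"  # e.g., a PNG-like format begins with a fixed signature

def satisfies_format_model(candidate: bytes) -> bool:
    """Minimal format-model constraint: magic number plus a length check.
    A real input model would also encode chunk structure and checksums."""
    return candidate.startswith(MAGIC) and len(candidate) >= 8

def explore(candidates):
    """Forward only format-valid candidates to (expensive) symbolic
    exploration; the rest would merely hit trivial parser errors."""
    return [c for c in candidates if satisfies_format_model(c)]

inputs = [b"\x89PNG\r\n\x1a\n", b"garbage!", b"\x89PNGxy"]
print(explore(inputs))  # only the first input satisfies the model
```

In the actual approach, the pruning happens inside the symbolic-execution engine: the format model is conjoined with the path condition, so the constraint solver never enumerates the mass of inputs that differ only in how they fail the parser.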


Published in

ASE '16: Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering
August 2016, 899 pages
ISBN: 9781450338455
DOI: 10.1145/2970276
General Chair: David Lo
Program Chairs: Sven Apel, Sarfraz Khurshid

        Copyright © 2016 ACM


Publisher: Association for Computing Machinery, New York, NY, United States

        Acceptance Rates

Overall Acceptance Rate: 82 of 337 submissions, 24%
