DOI: 10.1145/3338906.3338932

Finding and understanding bugs in software model checkers

Published: 12 August 2019

Abstract

Software Model Checking (SMC) is a well-known automatic program verification technique and is frequently adopted for checking safety-critical software. Thus, the reliability of SMC tools themselves (i.e., software model checkers) is critical. However, little work exists on validating software model checkers, an important problem that this paper tackles by introducing a practical, automated fuzzing technique. For its simplicity and generality, we focus on control-flow reachability (e.g., whether or how many times a branch is reached) and address two specific challenges for effective fuzzing: the test oracle and scalability. Given a deterministic program, we (1) leverage its concrete executions to synthesize valid branch reachability properties (thus solving the oracle problem) and (2) fuse these individual properties into a single safety property (thus improving the scalability of fuzzing and reducing manual inspection). We have realized our approach as the MCFuzz tool and applied it to extensively test three state-of-the-art C software model checkers: CPAchecker, CBMC, and SeaHorn. MCFuzz has found 62 unique bugs across the three model checkers; 58 have been confirmed and 20 have been fixed. We have further analyzed and categorized these diverse bugs and summarized several lessons for building reliable and robust model checkers. Our testing effort has been well received by the model checker developers and has also led to improved tool usability and documentation.
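
To make the workflow concrete, here is a minimal sketch in C of the kind of fused reachability check the abstract describes. It is illustrative only, not the paper's actual instrumentation: the counter names (hit_then, hit_else), the example program, and the use of a plain C assert as the safety property are assumptions; real model checkers are typically driven through their own property or error-label conventions.

#include <assert.h>

/* Branch counters a fuzzer could inject (hypothetical names). */
static unsigned hit_then = 0, hit_else = 0;

int main(void) {
    int x = 7;                       /* deterministic: no external input */
    for (int i = 0; i < 3; ++i) {
        if (x > i) { ++hit_then; }   /* then-branch */
        else       { ++hit_else; }   /* else-branch */
    }
    /* A concrete run shows the then-branch is taken exactly 3 times and the
     * else-branch is never reached. These individual reachability properties
     * are fused into one safety property that always holds for this program. */
    assert(hit_then == 3 && hit_else == 0);
    return 0;
}

Since the fused assertion is valid by construction, a correct model checker must report the program safe; a reported violation, crash, or hang on such a synthesized test points to a potential bug in the checker itself.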



Published In

ESEC/FSE 2019: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
August 2019, 1264 pages
ISBN: 9781450355728
DOI: 10.1145/3338906

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Fuzz Testing
  2. Software Model Checking
  3. Software Testing

Qualifiers

  • Research-article

Conference

ESEC/FSE '19

Acceptance Rates

Overall acceptance rate: 112 of 543 submissions (21%)
