DOI: 10.1145/3338906.3338932

Finding and understanding bugs in software model checkers

Published: 12 August 2019

Abstract

Software Model Checking (SMC) is a well-known automatic program verification technique and is frequently adopted for checking safety-critical software. Thus, the reliability of SMC tools themselves (i.e., software model checkers) is critical. However, little work exists on validating software model checkers, an important problem that this paper tackles by introducing a practical, automated fuzzing technique. For its simplicity and generality, we focus on control-flow reachability (e.g., whether or how many times a branch is reached) and address two specific challenges for effective fuzzing: the test oracle and scalability. Given a deterministic program, we (1) leverage its concrete executions to synthesize valid branch reachability properties (thus solving the oracle problem) and (2) fuse these individual properties into a single safety property (thus improving the scalability of fuzzing and reducing manual inspection). We have realized our approach as the MCFuzz tool and applied it to extensively test three state-of-the-art C software model checkers: CPAchecker, CBMC, and SeaHorn. MCFuzz has found 62 unique bugs across the three model checkers; 58 have been confirmed and 20 have been fixed. We have further analyzed and categorized these diverse bugs and summarized several lessons for building reliable and robust model checkers. Our testing effort has been well received by the model checker developers and has also led to improved tool usability and documentation.
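
To make the workflow concrete, here is a minimal sketch in C of the kind of fused reachability check the abstract describes. It is illustrative only, not the paper's actual instrumentation: the counter names (hit_then, hit_else), the example program, and the use of a plain C assert as the safety property are assumptions; real model checkers are typically driven through their own property or error-label conventions.

#include <assert.h>

/* Branch counters a fuzzer could inject (hypothetical names). */
static unsigned hit_then = 0, hit_else = 0;

int main(void) {
    int x = 7;                       /* deterministic: no external input */
    for (int i = 0; i < 3; ++i) {
        if (x > i) { ++hit_then; }   /* then-branch */
        else       { ++hit_else; }   /* else-branch */
    }
    /* A concrete run shows the then-branch is taken exactly 3 times and the
     * else-branch is never reached. These individual reachability properties
     * are fused into one safety property that always holds for this program. */
    assert(hit_then == 3 && hit_else == 0);
    return 0;
}

Since the fused assertion is valid by construction, a correct model checker must report the program safe; a reported violation, crash, or hang on such a synthesized test points to a potential bug in the checker itself.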



Published In

ESEC/FSE 2019: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
August 2019, 1264 pages
ISBN: 9781450355728
DOI: 10.1145/3338906

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Fuzz Testing
  2. Software Model Checking
  3. Software Testing

Qualifiers

  • Research-article

Conference

ESEC/FSE '19

Acceptance Rates

Overall acceptance rate: 112 of 543 submissions (21%)
