
1 Introduction

Kind 2 is an SMT-based model checker for synchronous reactive systems. It relies on off-the-shelf SMT solvers to prove or disprove quantifier-free regular safety properties of models written in an extension of the synchronous dataflow language Lustre [11]. These properties can be expressed, in a separate annotation language, as invariants or as assume-guarantee-style contracts. Kind 2 is inspired by its predecessor PKind [14] and uses several of the same techniques. However, it was engineered and implemented from scratch. Both checkers have several model checking engines, based on various techniques, which run concurrently and in cooperation, with the goal of proving or disproving properties and contracts.

Kind 2 is open-source and distributed in binary and source-code form under a liberal license at http://kind.cs.uiowa.edu. This paper focuses on its novel features, in particular, powerful invariant generation techniques, contract-based compositional reasoning, and proof certificate generation.

2 Functionality and Main Features

We start with a summary of Kind 2's basic functionality, i.e., (dis)proving safety properties of reactive systems, and then describe Kind 2's distinguishing features.

Safety Analysis. Lustre is a dataflow language that allows one to define system components as nodes, each of which maps a continuous flow of inputs (of various basic types) to continuous flows of outputs based on both current input values and previous input and output values (see Fig. 1 for a simple example). Bigger components can be built by parallel composition of smaller ones, achieved syntactically with node calls. Through the use of observers [12], any (LTL) regular safety property can be expressed in Lustre as an invariant property, hence Kind 2 focuses on checking just invariant properties.

Fig. 1. Example of annotated Lustre. Node sofar encodes the “always in the past” operator of pLTL.

After various transformations and slicing, Kind 2 encodes Lustre nodes internally as state transition systems \( \langle \mathbf {s}_{}, I(\mathbf {s}_{}), T(\mathbf {s}_{}, \mathbf {s}_{}') \rangle \) where \(\mathbf {s}_{}\) is the vector of typed state variables, \(I\) is the initial state predicate, and \(T\) is a two-state transition predicate (with \(\mathbf {s}_{}'\) being a renamed version of \(\mathbf {s}_{}\)). An invariant property P for such a system is a predicate over the variables \(\mathbf {s}_{}\) that must hold in every reachable state of the system. Instances of \(I\), \(T\) and P are quantifier-free first-order formulas over the theories of equality with uninterpreted functions and linear integer and real arithmetic.

The node construct allows one to specify modular and hierarchical systems. Kind 2 takes advantage of this by performing modular analysis over nodes. Each node can be assigned its own properties and verified individually. The results of the verification process (e.g., proven properties and auxiliary invariants) can be reused in the analysis of other components calling that node. Kind 2 takes this approach further by allowing the user to specify assume-guarantee-style contracts for each node, effectively enabling compositional reasoning by fine-grained abstraction of sub-components.

At the component level, given an encoding \(S\;\triangleq \; \langle \mathbf {s}_{}, I(\mathbf {s}_{}), T(\mathbf {s}_{}, \mathbf {s}_{}') \rangle \) of a Lustre node and a property P, Kind 2 tries to verify that P is invariant for \(S\) using a combination (described in Sect. 2) of different induction-based model checking engines: \(k\)-induction [16], IC3 [3] and various auxiliary invariant generation methods. \(K\)-induction is a generalization of standard induction and consists in finding a value k for which P holds in all reachable states within \(k-1\) steps (base case), and is preserved by transition chains of length k (step case). IC3 is a popular directed reachability approach that iteratively strengthens the given property until it becomes inductive. We use an extension of IC3 to infinite-state systems which is based on an efficient form of approximate quantifier elimination. In our experience, IC3 is often complementary to \(k\)-induction as it can prove properties that are not k-inductive for any k while \(k\)-induction can handle properties that IC3 finds hard to strengthen to an inductive one. The invariant generation engines of Kind 2 produce on the fly auxiliary invariants that are used to incrementally strengthen the transition relation T, increasing the chances of proving the step case of \(k\)-induction and facilitating the job of IC3.
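The base and step cases of \(k\)-induction described above can be illustrated with a minimal Python sketch. It makes strong simplifying assumptions: the state space is finite and tiny, and both checks are done by exhaustive enumeration rather than by SMT queries as in Kind 2. The toy system and all function names are hypothetical and chosen only for illustration.

```python
def chains(length, first_ok, trans, states):
    """All state sequences with `length` transitions whose first state satisfies first_ok."""
    result = [(s,) for s in states if first_ok(s)]
    for _ in range(length):
        result = [c + (t,) for c in result for t in states if trans(c[-1], t)]
    return result

def base_holds(k, init, trans, prop, states):
    # Base case: P holds along every chain of k-1 transitions from an initial state.
    return all(prop(s) for c in chains(k - 1, init, trans, states) for s in c)

def step_holds(k, trans, prop, states):
    # Step case: any chain of k transitions whose first k states satisfy P
    # (reachable or not) must end in a state satisfying P.
    return all(prop(c[-1])
               for c in chains(k, lambda s: True, trans, states)
               if all(prop(s) for s in c[:-1]))

def k_induction(init, trans, prop, states, max_k):
    for k in range(1, max_k + 1):
        if not base_holds(k, init, trans, prop, states):
            return ("falsified", k)   # a violation is reachable within k-1 steps
        if step_holds(k, trans, prop, states):
            return ("proved", k)
    return ("unknown", max_k)

# Hypothetical toy system over the states 0..7 (not taken from the paper).
STATES = range(8)

def init(s):
    return s == 0

def trans(s, t):
    # Even states cycle 0 -> 2 -> 4 -> 6 -> 0; odd states climb 1 -> 3 -> 5 -> 7 -> 7.
    return t == ((s + 2) % 8 if s % 2 == 0 else min(s + 2, 7))

def prop(s):
    return s != 5   # state 5 is unreachable from the initial state 0

print(k_induction(init, trans, prop, STATES, max_k=5))
```

In this toy system the property is not 1-inductive, because the unreachable chain 3 → 5 breaks the step case. It only becomes inductive once the chains are long enough that no chain of property-satisfying states can lead into state 5. Auxiliary invariants play the complementary role: strengthening the transition relation removes such spurious chains at smaller values of k.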

Incremental and Modular Invariant Generation. PKind introduced an invariant generation technique parameterized by a partial order \(\preceq \) over some (equality) type \(\uptau \) [13]. It starts from a set of candidate terms \(\mathbb {C}\) of type \(\uptau \) over a system \(S\) and heuristically produces invariants of the form \(c \preceq c'\) and \(c = c'\) where \(c, c' \in \mathbb {C}\). For the bool type, used in Lustre both for Boolean state variables and for properties, \(\preceq \) is implication and \(\mathbb {C}\) is constructed by mining the initial state predicate and the transition relation of \(S\) for Boolean terms. The approach maintains an index k and a directed acyclic graph (DAG), whose vertices are sets of terms from a partition of \(\mathbb {C}\). A vertex \(V = \{c_1, c_2, \ldots , c_n\}\) denotes the chain of equalities \(c_1 = c_2 = \cdots = c_n\). An edge from vertex V to vertex \(V'\) denotes the inequality \(c \preceq c'\) for any term c in V and \(c'\) in \(V'\). The DAG is a compact representation of a set of invariant conjectures about \(S\). Initially, \(k = 0\) and the DAG has a single vertex \(\mathbb {C}\), conjecturing that all the terms in \(\mathbb {C}\) are equivalent in every reachable state of \(S\). This conjecture is tested with a Bounded Model Checking-style query to an SMT solver for a counterexample k states away from an initial state. If none is found, the conjecture is correct for states reachable in up to k steps from an initial one, and k is incremented. Otherwise, the DAG is modified by removing edges or splitting vertices so that its refined conjecture is consistent with the latest counterexample and all previous ones. The algorithm refines its DAG and increments k until k reaches a user-specified upper bound \(d\). It then performs a multi-property \((d+ 1)\)-induction step check over each element of the conjecture. Any equality or inequality between two candidate terms in the conjecture that is i-inductive for \(i \le d\) will be proved and communicated as invariant.
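The refinement of the conjecture graph can be sketched in Python for the Boolean case, where \(\preceq \) is implication. This is a simplified illustration, not PKind's or Kind 2's implementation: counterexamples come from explicitly enumerating the states reachable within a given depth instead of from SMT queries, implication edges are recomputed at the end rather than pruned incrementally, and the final induction step check is omitted, so the result is a set of conjectures rather than proven invariants. The candidate terms and the counter system are hypothetical.

```python
def refine(partition, state, terms):
    """Split each class so that all terms in a class agree on `state`."""
    refined = []
    for cls in partition:
        true_part = frozenset(name for name in cls if terms[name](state))
        false_part = cls - true_part
        refined.extend(part for part in (false_part, true_part) if part)
    return refined

def conjecture(init_states, successors, terms, depth):
    partition = [frozenset(terms)]        # single vertex: all terms conjectured equal
    frontier, seen = set(init_states), set()
    for _ in range(depth + 1):            # k = 0, 1, ..., depth
        for state in frontier:            # each state acts as a counterexample
            partition = refine(partition, state, terms)
        seen |= frontier
        frontier = {t for s in frontier for t in successors(s)}

    def holds(cls, state):
        # All terms of a class agree on every seen state, so any member will do.
        return terms[next(iter(cls))](state)

    # Edge U -> V: the terms of U imply the terms of V on every state seen so far.
    edges = {(u, v) for u in partition for v in partition
             if u != v and all(not holds(u, s) or holds(v, s) for s in seen)}
    return set(partition), edges

# Hypothetical candidate terms over a counter c that cycles 0, 1, 2, 3, 0, ...
terms = {
    "c >= 0": lambda c: c >= 0,
    "c < 4": lambda c: c < 4,
    "c = 0": lambda c: c == 0,
    "c <= 1": lambda c: c <= 1,
}
classes, edges = conjecture({0}, lambda c: {(c + 1) % 4}, terms, depth=3)
for cls in classes:
    print(sorted(cls))
```

Here the two terms that hold in every visited state end up in the same vertex, while the other two terms are split off into vertices of their own, each with an implication edge toward the weaker vertices.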

We have modified this technique so that it progresses in lockstep. When the conjecture is correct at depth k, the invariant generation engine of Kind 2 performs the \((k+1)\)-induction step check right away. This allows it to output invariants that are k-inductive for a small k faster. An additional benefit is that there is no need for a user-defined upper bound \(d\), whose value can vastly influence runtimes—for instance on large systems, where unrolling the transition predicate several times can be extremely expensive.

Furthermore, Kind 2 can execute this invariant generation technique when the input system is defined as the composition of two or more nodes. In that case, the subsystem hierarchy is traversed bottom-up. For each subsystem \(S\), a set of \((k+1)\)-inductive invariants (with k initially 0) is obtained as discussed above. Those invariants are then instantiated in every subsystem that has \(S\) as a direct subcomponent, recursively. Once the process reaches the top-level system, any invariants discovered at that level are communicated to the other reasoning engines of Kind 2. At that point a new bottom-up traversal starts with a greater value of k. This approach has two significant advantages with respect to running invariant generation on the full system monolithically: (i) it discovers invariants for subsystems more easily and quickly; (ii) it is self-reinforcing since instances of the invariants discovered for a subsystem of a component \(S\) can be used to help prove invariant conjectures for \(S\).

Compositional Reasoning. Contract-based compositional reasoning is a popular technique to improve the scalability of verification tools on systems defined as hierarchies of components. Components have contracts enforcing their use in a certain context in order for them to guarantee certain properties (see Fig. 2 for an example). Analyzing a component consists in checking that its contract holds after abstracting at call-site all of its (possibly complex) sub-components by their own contract. A contract for a system \(S\;\triangleq \; \langle \mathbf {s}_{}, I(\mathbf {s}_{}), T(\mathbf {s}_{}, \mathbf {s}_{}') \rangle \) is a pair \(C\;\triangleq \; \langle A_{}(\mathbf {s}_{}),G_{}(\mathbf {s}_{}) \rangle \) where, informally, the predicate \(A\) describes properties that \(S\) expects its inputs to have, while the predicate \(G\) expresses how the component behaves when \(A\) holds at all times. A contract can introduce local variables (streams), refer to previous values of streams, and call arbitrary Lustre nodes.

Fig. 2. Lustre nodes with contracts.

This makes Kind 2's contract language expressive enough to represent any regular safety properties, once they are recast in terms of past temporal logic (see [6] for more details on the contract language and its use). In Kind 2, verifying that \(S\) satisfies its contract reduces to verifying that \(G(\mathbf {s}_{})\) is an invariant for the system \( S_A\; = \; \langle \ \mathbf {s}_{},\ I(\mathbf {s}_{}) \wedge A(\mathbf {s}_{}),\ A(\mathbf {s}_{}) \wedge T(\mathbf {s}_{}, \mathbf {s}_{}') \wedge A(\mathbf {s}_{}')\ \rangle . \) If \(S\) is a component of some larger system \(S'\), which provides it with input values, then \(S\) can be abstracted by its guarantee \(G\) at call-site in \(S'\) as long as the assumption \(A\) at call-site is an invariant for \(S'\). If it is, we say the call is safe. If the call is unsafe, then so is \(S'\) since it does not respect the contract of \(S\). If all components of a system verify their contract and make only safe calls then the overall system is safe. Kind 2 can construct this argument via a compositional analysis, where system components are analyzed bottom-up in the subsystem hierarchy with a process similar to modular invariant generation.
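The construction of \(S_A\) can be made concrete with a small Python sketch. It assumes a finite state space and checks invariance by explicit reachability, which stands in for Kind 2's SMT-based engines. The saturating accumulator component, its contract, and all names are hypothetical.

```python
from itertools import product

# State s = (x, y): input x in {-1, 0, 1}, output y in -3..3 (hypothetical component).
STATES = list(product((-1, 0, 1), range(-3, 4)))

def init(s):
    x, y = s
    return y == x                      # initially the output equals the input

def trans(s, t):
    # The next output accumulates the next input, saturating at -3 and 3.
    return t[1] == max(-3, min(3, s[1] + t[0]))

def assume(s):                         # A: the input is never negative
    return s[0] >= 0

def guarantee(s):                      # G: the output is never negative
    return s[1] >= 0

def restrict(init, trans, assume):
    """Build S_A: initial states satisfy I and A; transitions satisfy A, T, and A'."""
    def init_a(s):
        return init(s) and assume(s)
    def trans_a(s, t):
        return assume(s) and trans(s, t) and assume(t)
    return init_a, trans_a

def is_invariant(init, trans, prop, states):
    """Explicit-state reachability check that prop holds in every reachable state."""
    reached = {s for s in states if init(s)}
    frontier = set(reached)
    while frontier:
        if not all(prop(s) for s in frontier):
            return False
        frontier = {t for s in frontier for t in states if trans(s, t)} - reached
        reached |= frontier
    return True

init_a, trans_a = restrict(init, trans, assume)
print(is_invariant(init_a, trans_a, guarantee, STATES))   # contract check on S_A
print(is_invariant(init, trans, guarantee, STATES))       # same check without A
```

The guarantee is an invariant of the restricted system but not of the unrestricted one, which is why a caller that cannot establish the assumption makes an unsafe call.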

Refinement. Kind 2's modular and compositional analysis of multi-component systems resorts to contract refinement when needed. Consider a system \(S_1\) with contract \(C_1 = \langle A_{1}(\mathbf {s}_{1}),G_{1}(\mathbf {s}_{1}) \rangle \) that uses a subsystem \(S_2\) with contract \(C_2 = \langle A_{2}(\mathbf {s}_{2}),G_{2}(\mathbf {s}_{2}) \rangle \). Suppose that Kind 2 cannot prove \(S_1\)’s contract compositionally, that is, by abstracting \(S_2\) by its contract. A reason for this might be that the abstraction provided by \(C_2\) is too weak. Kind 2 will then refine \(S_1\) in the analysis by replacing \(S_2\)’s contract with \(S_2\) itself, provided, however, that the following conditions are met: (i) \(S_2\) is safe (i.e., it verifies its contract and does not make unsafe calls), and (ii) all calls to \(S_2\) in \(S_1\) are provably safe. If the new analysis succeeds, the user is notified of the specific contract abstraction under which the result was obtained. Otherwise, the refinement process continues recursively until no more contract refinement is possible. When a system like \(S_2\) is used instead of its contract \(C_2\) it is because it provably admits a smaller set of execution traces than \(C_2\). Because of this, when analyzing a newly refined system, Kind 2 retains any invariant/property already proved and any information on properties that are still unproven or falsified. This means that when the analysis restarts after refinement, Kind 2 will only check the proof obligations that were not previously discharged, in effect restarting precisely from where the previous analysis had stopped.
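The refinement loop can be outlined as follows. This is a schematic Python sketch under several assumptions: it handles a single level of the hierarchy, it tracks only proven obligations (not falsified ones), and the callbacks `check`, `is_safe`, and `calls_safe` are hypothetical placeholders for the actual analyses, not part of Kind 2's interface.

```python
def analyze_with_refinement(obligations, subsystems, is_safe, calls_safe, check):
    """Try to prove all obligations, replacing contracts by implementations as needed.

    check(pending, abstracted) is a hypothetical callback returning the subset of
    `pending` obligations it could prove when the subsystems in `abstracted` are
    represented by their contracts.
    """
    obligations = set(obligations)
    abstracted = set(subsystems)       # subsystems currently abstracted by contract
    proven = set()
    while True:
        pending = obligations - proven             # only undischarged obligations
        proven |= check(pending, frozenset(abstracted))
        if proven >= obligations:
            return "proved", proven, abstracted
        # Refinement is allowed only for safe subsystems whose calls are safe.
        refinable = sorted(s for s in abstracted if is_safe(s) and calls_safe(s))
        if not refinable:
            return "unknown", proven, abstracted
        abstracted.remove(refinable[0])            # use the implementation instead

# Toy usage: obligation "p2" is provable only once "S2" is no longer abstracted.
log = []

def fake_check(pending, abstracted):
    log.append((set(pending), set(abstracted)))
    return {p for p in pending if p == "p1" or "S2" not in abstracted}

result = analyze_with_refinement({"p1", "p2"}, ["S2"],
                                 lambda s: True, lambda s: True, fake_check)
print(result[0], sorted(result[1]), sorted(result[2]))
```

In the toy run the second call to `fake_check` receives only the obligation left unproven by the first call, mirroring the restart behaviour described above.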

Certification. Having to trust the results of complex model checkers like Kind 2 is a source of concern for some users. To address this problem, Kind 2 can produce an independently checkable certificate for the properties that it claims to have proven for a (sub)system. This certificate is in the form of a k-inductive invariant (expressed as a formula together with a specific value of k) that implies all the proven properties. This form is general enough that it can be effectively produced by all the model checking engines described previously. Certificates coming from these engines are combined conjunctively thanks to the fact that a k-inductive invariant is also \(k'\)-inductive for any \(k' \ge k\). Individual certificates are initially generated by single engines based on their deductions regarding some set of properties and invariants. The combined certificate is then simplified along two dimensions, the value of k and the size and complexity of the invariant itself, using various fixpoint-based heuristics relying on unsat cores and counterexamples to induction. The final certificate output by Kind 2 is written in SMT-LIB 2 format and embedded in an SMT-LIB 2 script that checks that the certificate is k-inductive and implies the proven input properties. As a first approximation, any SMT-LIB 2-compliant solver can then be used as a certificate checker. This essentially shifts the burden of trust from Kind 2 to the SMT solver, reducing the trusted core to the latter. In our initial empirical evaluation, this approach allows Kind 2 to generate and check certificates, with an SMT solver, with a reasonable overhead (in all cases, less than an order of magnitude). We are currently working on eliminating the SMT solver as well from the trusted core by capitalizing on the proof-producing capabilities of certain SMT solvers. Specifically, in collaboration with the developers of the CVC4 solver [2], we are instrumenting Kind 2 to generate from CVC4 a final certificate in the LFSC language [17]. This way, the trusted core will reduce even further, to the much simpler LFSC proof checker.
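The obligations discharged by the checking script can be spelled out as follows. This is one possible reading, assuming a certificate \((\phi , k)\) for a system \(\langle \mathbf{s}, I, T \rangle \) with proven properties \(P_1, \ldots , P_n\); the symbol \(\phi \) and the unrolled state vectors \(\mathbf{s}_0, \ldots , \mathbf{s}_k\) are notation introduced here for illustration and need not match the generated script.

\[
\begin{array}{ll}
\text{(base)} & I(\mathbf{s}_0) \wedge \bigwedge_{j=0}^{i-1} T(\mathbf{s}_j, \mathbf{s}_{j+1}) \;\Rightarrow\; \phi(\mathbf{s}_i) \quad \text{for each } 0 \le i < k \\[4pt]
\text{(step)} & \bigwedge_{j=0}^{k-1} \bigl( \phi(\mathbf{s}_j) \wedge T(\mathbf{s}_j, \mathbf{s}_{j+1}) \bigr) \;\Rightarrow\; \phi(\mathbf{s}_k) \\[4pt]
\text{(implication)} & \phi(\mathbf{s}) \;\Rightarrow\; P_1(\mathbf{s}) \wedge \cdots \wedge P_n(\mathbf{s})
\end{array}
\]

Under this reading, two certificates \((\phi _1, k_1)\) and \((\phi _2, k_2)\) combine into \((\phi _1 \wedge \phi _2, \max (k_1, k_2))\): each conjunct remains inductive at the larger depth, and the step hypothesis for the conjunction is at least as strong as the one for either conjunct alone.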

Architecture. Kind 2 is written in OCaml and has a concurrent architecture similar to that of PKind. Its various engines (base case and inductive step of \(k\)-induction, IC3, invariant generation, and so on) run simultaneously and in cooperation. They exchange information, mostly about properties proved or disproved to be invariant, through a message passing interface implemented on top of the ZeroMQ library. The concurrent execution of the base case (BMC) of \(k\)-induction with the step case makes Kind 2 efficient at disproving properties. This architecture provides superior support for systems with multiple components and properties since it allows Kind 2 to check several properties per component at the same time and output counterexamples or proven properties incrementally, as it discovers them. Various off-the-shelf SMT solvers (currently, CVC4 [2], Yices [9], and Z3 [8]) are used as backend reasoning engines.

Fig. 3. Comparison between Kind 2 and other infinite-state model checkers. (colour figure online)

Table 1. Techniques implemented in the tools.

3 Experimental Evaluation

Compositionality and certificate generation make Kind 2's internal architecture more complex, and with a higher potential overhead, than that of comparable model checkers. We therefore first evaluate Kind 2's performance as a monolithic model checker (without certificate generation), before discussing the performance of its compositional reasoning features.

Comparison with Other Tools. We compared Kind 2 with a number of recent model checkers for infinite-state systems: PKind [14]; JKind [10], a model checker similar to PKind developed in Java by Rockwell-Collins; Zustre, a Lustre front end for the Z3-based model checker Spacer [15]; and nuXmv [5], a general purpose model checker for synchronous finite-state and infinite-state systems. Table 1 shows the techniques implemented by each tool among a modular version (m) of \(k\)-induction with or without invariant generation (ig), and IC3 possibly augmented with interpolation (i) or implicit predicate abstraction [7] (ia). We ran each tool on a Linux machine with two 12-core 64-bit AMD Opteron processors and 32GB of memory on a set of single-property benchmarks that includes those discussed in [14]. nuXmv was given encoded versions of the original Lustre problems in its own input language, which were provided to us by its developers. We gave a timeout of five minutes for each problem. Figure 3 shows that Kind 2 is very competitive with its peers, outperforming its predecessor PKind and providing an answer (either valid or a counterexample) in more cases than any other tool.

Compositional vs. Monolithic Verification. We evaluated compositional reasoning in Kind 2 on the TCM (Transport Class Model) for medium-sized aircraft discussed and verified compositionally by hand by Brat et al. [4]. The subsystem of the TCM we had access to, which is modeled in Lustre, includes components for the latitudinal and longitudinal controllers, and for the mode logic that decides which controller should be active at any time. The controllers are heavily numerical and contain non-linear expressions, which are problematic for current SMT solvers. We wrote contracts corresponding to Federal Aviation Regulations [4] for most of the components of the subsystem. We also abstracted non-linear expressions by components with a linear contract.

The runtime to verify every component of the system bottom-up, including the abstractions of non-linear expressions, is about 400 s on a 2014 i7 CPU running OS X. A comparison with a purely monolithic approach is not possible because of the presence of non-linearity. All SMT solvers we tried would return unknown, even for checks dealing with a single, relatively simple component. As a consequence, we did a monolithic analysis of a modified TCM system where the non-linear expressions are replaced by their linear contract but otherwise nothing else is abstracted. In this setting, the analysis of the top level of the system ran for two hours without reaching a conclusion. We refer the interested reader to Champion et al. [6] for a more in-depth discussion.

4 Applications

Kind 2 is used in academia and in a variety of industrial settings. For the latter, it is for instance one of the backend model checkers in the AGREE framework for compositional verification of AADL models [1] at Rockwell-Collins. It has been used at General Electric for model-based test case generation. It is also used in an open-source model-checking plugin for Simulink developed by NASA Ames and CMU, which relies on Lustre model checkers and produces user feedback at the Simulink block level. Kind 2's proof certificates are leveraged as an innovative way to approach tool qualification with respect to DO-178C requirements in a NASA- and FAA-funded project.