figure a

1 Introduction

Shared-memory concurrent algorithms are critical components of many systems, for example as locks, reference counters, work-queues, and garbage collectors [12]. These algorithms must achieve high performance, while also enforcing properties such as mutual exclusion and safe memory reclamation. In pursuit of performance, modern algorithms have become increasingly complex. As a result, by-hand correctness arguments are unreliable, and formal verification remains very challenging.

Concurrent algorithms often depend on intangible concepts such as thread-local ownership of resources, and protocols between threads. For example, a thread that acquires a lock takes ownership of the guarded resource, and the mutual exclusion protocol forbids other threads from accessing the lock at the same time. Beginning with Concurrent Separation Logic (CSL) [18], program logics have integrated these concepts directly in reasoning, which has enabled the verification of many challenging algorithms (see Sect. 7, Related Work).

However, these logics derived from CSL are very complex, with auxiliary proof constructs such as fractional permissions, shared regions, and labelled transition systems. Complexity makes these logics difficult to learn and difficult to reason with, and non-standard proof constructs make tooling hard to develop, and therefore rare. As a result, there are substantial barriers to applying these logics in practice.

We present Starling, a new program logic and verification tool for concurrent algorithms. Our approach is inspired by CSL and its relatives, but we dispense with heavyweight auxiliary proof concepts. Starling’s proofs are lightweight, easy to read, and easy to automate – but powerful enough to verify challenging concurrent algorithms.

Starling’s approach is based on views – units of linear, invariant information that can be held by a single thread. Proofs in Starling are written in a lightweight proof-outline style, with views annotating program points and constraints defining their meaning in the underlying domain. Notions such as ownership and protocol can be expressed through interactions between views. For example, we can have a view expressing that the thread holds a lock, then express mutual exclusion by forbidding two threads from holding this view at the same time.

Starling’s reasoning is built on the pre-existing Views framework [6]: this was designed as an off-the-shelf metatheory for encoding other logics, but we instead instantiate it directly as a simple view-based logic. The Views framework works by reducing a concurrent proof to multiple applications of a single core proof rule. We use this to reduce a Starling proof to a collection of verification conditions that can be discharged using a sequential solver. Building on the Views framework means that Starling requires minimal extra metatheory and can easily be automated.

Our approach is agnostic to the underlying data domain: we require only an appropriate sequential solver. In this paper, we instantiate our approach with two domains. First, for algorithms that use shared variables and linear arithmetic, we generate SMT queries, which are discharged using Z3 [5]. For algorithms that use dynamic linked data-structures, we generate queries written in separation logic, which we discharge using GRASShopper [20]. In both cases, our approach lets us map uniformly from concurrent reasoning into sequential verification conditions.

We have tested Starling on a collection of real-world concurrent algorithms. Many of these are synchronisation algorithms, one of the most important classes of concurrent algorithm. Our running example is Rust’s Atomic Reference-Count algorithm, which prevents reuse of an object after it has been freed. We also verify several different lock algorithms including the CLH queue-lock algorithm, Peterson’s algorithm, and a fine-grained list algorithm. As is often the case in concurrency, these algorithms are small in size but exhibit killer subtleties that make verification very challenging. Other approaches would require considerably more proof annotations, or customised auxiliary proof constructs. We show that these algorithms can be verified using a lightweight, automated approach.

Our tool is open source (MIT license) and available on GitHub:

https://github.com/septract/starling-tool

2 Motivating Example: ARC

The Atomic Reference-Count (ARC) algorithm is used to ensure that a shared object is not disposed before all threads are finished with it. In Rust, the ARC forms an important part of the concurrency model [23]. Our version of the ARC has three operations:

figure b

2.1 Specification

To specify the ARC using our approach, we first declare the view atom \(\mathsf {arc}()\). A view atom is a unit of linear, invariant information that can be held by a thread. The atom \(\mathsf {arc}()\) states that the thread holds a single reference to the ARC object. We do not specify the meaning of \(\mathsf {arc}()\) in the program state yet (in this way view atoms resemble the abstract predicates of Dinsdale-Young et al. [8]).

View atoms can be conjoined into unboundedly large views using the composition operator, \(*\). This operator is linear, not standard conjunction: for example the view \({\mathsf {arc}()} * {\mathsf {arc}()} * {\mathsf {arc}()}\) asserts that the thread holds three separate references to the ARC object. A thread could also hold zero references to the ARC, represented by the special unit view \({{\mathsf {emp}}}\). The \(*\) operator is generalised from separating conjunction in separation logic, but views need not have disjoint heap representations.

Using \({\mathsf {arc}()}\) and \({{\mathsf {emp}}}\), we give the ARC operations Hoare-style specifications:

figure c

The clone method creates a new reference, represented by a duplicate \({\mathsf {arc}()}\) atom in its postcondition. The access method requires an ARC reference to ensure the object has not been disposed: the \({\mathsf {arc}()}\) atom in its precondition represents this. The drop method takes an ARC reference, represented by an \({\mathsf {arc}()}\) atom, and destroys it leaving \({{\mathsf {emp}}}\).

In our tool, specifications are implicitly framed with arbitrary views. The frame represents other views held locally or by other threads. For example, the thread might hold three ARC references, and then call drop():

figure d

As can be seen, the frame \(\mathsf {arc}() * \mathsf {arc}()\) is unaffected by calling drop(). Likewise, if some other thread held \(\mathsf {arc}() * \mathsf {arc}()\) it would be unaffected by the call.

Framing means that every view must continue to hold irrespective of the behaviour of other threads. However, \({\mathsf {arc}()}\) atoms are not independent in their underlying representation, nor between each other. In their representation, all the \({\mathsf {arc}()}\) views refer to the same shared variables. Also, the reference count must not be smaller than the total number of \({\mathsf {arc}()}\) atoms across all threads – otherwise a thread could access the object after it has been disposed. Reasoning about this combination of thread-local views and inter-thread interaction is the core problem that our approach solves.

Fig. 1. Shared-variable version of ARC, and proof.

2.2 Proof

Figure 1 shows an ARC implementation, and a proof that it satisfies our specification. (Here, and elsewhere, we elide some details such as variable declarations.)

In this implementation we model a single ARC instance by shared variables. The integer variable count holds the reference count, while disposal is modelled by the boolean variable free. This simplification to variables means we can discharge the proof using an SMT solver. Below, we verify a heap-allocated ARC using the GRASShopper separation-logic solver.

Our programming language is a standard while-language, with atomic commands written with angle-brackets, \(\texttt {<| |>}\). The proof itself consists of Hoare-style assertions, written in views, that are interleaved into the program. These assertions are written using assertion brackets \(\texttt {\{| |\}}\). As well as plain views, views can hold conditional on local variables: for example, in Fig. 1 we write . The complete syntax for Starling’s input language is given in Appendix A.

In addition to the \(\mathsf {arc}()\) atom discussed above, the proof uses the additional atom \(\mathsf {countCopy}(c)\), which represents the fact that c was previously observed as the value of count. (It does not mean that count is currently c, as count can change through the action of other threads).

The meaning of the views in the underlying program state is given by constraints. There are unboundedly many possible composite views, but we need only give meanings for a minimal set of defining views – meanings for others are derived from these. Section 3 explains how this derivation works.

In Fig. 1, the meaning of a single \(\mathsf {countCopy}(c)\) atom is given by the following constraint:

figure e

Once a thread observes count as 1 in a fetch-and-decrement, the ARC cannot be disposed by any other thread, and the value of count will always be zero. This depends on count accurately recording the number of references to the ARC: once count is 1, the only thread with access is the current one.

Constraints can also specify interactions between views. Interactions can be between views on the same or multiple different threads – we make no distinction between the two. In Fig. 1, two \(\mathsf {countCopy}(c)\) atoms have the following meaning:

figure f

If two threads take copies of count, only one of them can equal 1: again, this depends on the counter accurately recording the number of references.

The final important properties represented in the proof are, first, that the ARC is not disposed until all references are removed; and, second, that count accurately records the number of references. Each \(\mathsf {arc}()\) atom represents a reference, so we need the following:

figure g

In the proof, this is expressed directly by the following constraint on views:

figure h

The iter[n] keyword indicates that we have n instances of the \(\mathsf {arc}()\) atom on the same thread or across different threads.
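To make the counting invariant concrete, the following Python sketch models the shared-variable ARC as a small state machine and checks the constraint on \(\mathsf {arc}()\) after every atomic step. It is an illustration only: the step bodies (clone as an atomic increment, drop as a fetch-and-decrement followed by disposal when the observed value is 1) are assumed from the prose rather than copied from Fig. 1, and `ArcModel`, `held`, `invariant` and `random_run` are hypothetical names.

```python
import random


class ArcModel:
    """Toy model of the shared-variable ARC; each method is one atomic step."""

    def __init__(self):
        self.count = 1      # shared reference count
        self.free = False   # shared disposal flag
        self.held = 1       # ghost value: number of arc() atoms across all threads

    def invariant(self):
        # iter[n] arc()  ->  n > 0 => (!free && n <= count)
        n = self.held
        return n <= 0 or (not self.free and n <= self.count)

    def clone(self):
        assert self.held > 0, "clone requires an arc() atom"
        self.count += 1     # assumed body: atomic increment of count
        self.held += 1      # postcondition: arc() * arc()

    def access(self):
        assert self.held > 0, "access requires an arc() atom"
        assert not self.free  # the object must not have been disposed

    def drop(self):
        assert self.held > 0, "drop requires an arc() atom"
        c = self.count      # assumed body: fetch-and-decrement
        self.count -= 1
        self.held -= 1      # the caller's arc() atom is consumed
        if c == 1:
            self.free = True  # last reference: dispose


def random_run(steps, seed):
    """Run random operations and check the invariant after each one."""
    rng = random.Random(seed)
    arc = ArcModel()
    for _ in range(steps):
        if arc.held == 0:
            break
        rng.choice([arc.clone, arc.access, arc.drop])()
        assert arc.invariant()
    return arc
```

A random run only samples interleavings of whole operations, so it is a sanity check on the constraint and not a substitute for the proof.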

Fig. 2. Heap-allocated version of ARC, and proof.

2.3 Heap-Allocated ARC

The implementation in Fig. 1 modelled a single ARC by shared variables – as a result, we can discharge this proof using an SMT back-end. In Fig. 2, we give a more realistic implementation where ARCs are heap-allocated structs. To discharge this proof, we use GRASShopper, a solver for separation logic [20].

The most important implementation change is a new method init which allocates a new ARC. This method has the following specification:

figure i

A further difference is that heap commands are written in GRASShopper’s input language. We embed these using the special brackets %{ }, and allow variables to be referenced using the inner brackets \(\texttt {[| |]}\). For example, in clone, we write the following for an atomic increment:

figure j

By combining heap commands we can build complex atomic operations – for example an atomic fetch-and-decrement operation, as used in drop:

figure k

Despite the fact that this implementation targets a much richer domain than shared variables, we can apply the same proof strategy as Fig. 1. The same views are needed, though they are now parameterised by the address of the ARC. Likewise, the same constraints are needed, modified to use GRASShopper’s constraint language. As with commands, we embed GRASShopper assertions using the special brackets %{ }. For example, this is the constraint on a single \(\mathsf {countCopy}(x,c)\) atom:

figure l

Here, requires that x is in the set of allocated ARCs – this corresponds to the requirement that free is false in Fig. 1. Likewise, corresponds to the constraint on the value of count.

With both the variable-based and heap-based versions of the ARC, our approach gives a simple proof that captures the algorithm’s linear nature. Our approach lets us convert these lightweight proofs into verification conditions that can be discharged by either SMT or GRASShopper as appropriate. We next explain how this translation works.

3 Theory

Starling’s theory works by recasting the pre-existing Views framework [6] into a form suitable for automation. As the Views framework has been proved sound in Coq, this gives us a simple way of justifying the soundness of our translation into a set of verification conditions.

3.1 Owicki-Gries

For comparison, we first consider the Owicki-Gries method [19], one of the simplest approaches to Hoare-style verification of a concurrent program. Owicki-Gries presents us with a single core rule for validating a proof outline. Let \(\textsf {Axioms}\) be the set of atomic Hoare triples of the proof; \(\textsf {Formula}\) the set of all formulas used in the outline; and \(\models _{\text {Hoare}}\) the entailment rule for Hoare logic. Then, the Owicki-Gries proof rule is written as:

$$ \forall \,\{P\}\,c\,\{Q\} \in \textsf {Axioms}.\; \forall F \in \textsf {Formula}.\; \models _{\text {Hoare}} \{P \wedge F\}\,c\,\{Q \wedge F\} $$

This rule expresses two key correctness properties for a concurrent system. First, each command behaves correctly in a sequential setting – the post-state Q is established from the pre-state P. Second, no command interferes with any properties needed by other threads – the frame F is preserved by c.

To achieve completeness, Owicki-Gries needs auxiliary variables: additional variables that capture key aspects of the local state of each thread. To encode Starling into Owicki-Gries, we would need to use auxiliary variables to encode the richer interactions our constraint system permits. However, these variables can hide the details of the verification and make proof discovery harder. We need a different approach.

3.2 Views

We eliminate the need for auxiliary variables, while keeping much of the shape and simplicity of Owicki-Gries, by building on the Views framework [6]. Views was originally an off-the-shelf metatheory for proving the soundness of concurrent reasoning systems; we recast it as an Owicki-Gries-style proof rule. In this paper, we introduce just enough of the Views framework to support Starling’s theory – this fits with the framework’s purpose as reusable metatheory.

The Views framework is designed to allow a broad range of reasoning systems to be encoded into a small set of parameters. If these parameters satisfy a few key properties, the encoded reasoning system is sound.

The parameters that must be instantiated include the sets Views, from which all assertions in the logic are derived; Cmds, containing atomic commands; and Axioms, containing the atomic Hoare triples over views and commands. The reasoning system must also define a view composition operator \(*\) and unit view \({\mathsf {emp}}\), which together must form a monoid with Views; a reification function \(\lfloor {\_}\rfloor \) mapping Views to their representation in the underlying state; and a semantic function \(\llbracket {\_}\rrbracket \) mapping atomic commands to state transformers.

Taken together, these parameters must satisfy the key property of axiom soundness:

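$$ \forall \,\{P\}\,c\,\{Q\} \in \textsf {Axioms}.\; \forall V \in \textsf {Views}.\; \llbracket c \rrbracket (\lfloor P *V\rfloor ) \subseteq \lfloor Q *V\rfloor $$
(1)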

This rule requires that every atomic Hoare triple generated by the reasoning system upholds sequential correctness, and inter-thread non-interference, just as we saw in Owicki-Gries. As the Views approach makes no distinction between contexts that are on the same thread or other threads, it captures both Concurrent Separation Logic’s Frame and Parallel rules:

figure m

In Starling, we recast Rule (1) to generate verification conditions from proofs. In comparison to Owicki-Gries, the Views proof rule allows us to avoid auxiliary variables in most cases. In Owicki-Gries, assertions and contexts are joined by conjunction, but in the Views rule they are joined by view composition, \(*\), and their reification is defined separately. This means that we can define interactions between views that go beyond their individual reifications – for example to enforce mutual exclusion between views. This gives our proof system its power.

3.3 Instantiating the Views Rule

We first instantiate the Views framework parameters in a way that is suitable for Starling’s reasoning. For Starling, view atoms consist of a name and a sequence of value arguments, and views are multisets of view atoms. More formally, we define Views as:

figure n

(Below we sometimes call these plain views to distinguish them from constructs such as view patterns.)

Starling Views form a monoid with the multiset union \(\cup _\mathsf{m}\) as the view composition \(*\), and the empty multiset \(\emptyset \) as the unit view \({\mathsf {emp}}\).
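As a small illustration of this monoid, plain views can be modelled with Python's `collections.Counter`. This is a sketch under one assumption: multiset union is read as additive union, which is what \({\mathsf {arc}()} * {\mathsf {arc}()} * {\mathsf {arc}()}\) denoting three references suggests. The helper names `atom`, `compose` and `EMP` are hypothetical.

```python
from collections import Counter


def atom(name, *args):
    """A view atom: a name plus a tuple of argument values."""
    return (name, tuple(args))


EMP = Counter()  # unit view: the empty multiset


def compose(p, q):
    """View composition *: multiset union (multiplicities add)."""
    return p + q


arc = Counter([atom("arc")])
three = compose(arc, compose(arc, arc))   # arc() * arc() * arc()
```

Here `three[atom("arc")]` is 3, composing with `EMP` leaves a view unchanged, and `compose` is associative and commutative because `Counter` addition is.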

We first change Rule (1) by making the state accessed by a command explicit. We model the state as a pair \((l, s)\) of thread-local and shared components. The command semantics \(\llbracket c \rrbracket \) is then a relation over these states. We write \(\lfloor P\rfloor (s)\) to say that state s is in the representation of P, and (for now) ignore the local state. The resulting rule is:

$$\begin{aligned} \begin{array}{ll} \forall \,\{P\}\,c\,\{Q\} \in \textsf {Axioms}. &{} \\ \forall ((l,s), (l',s')) \in \llbracket c \rrbracket . \forall V \in \textsf {Views}.\, &{} \lfloor P *V\rfloor (s) \Rightarrow \lfloor Q *V\rfloor (s') \end{array} \end{aligned}$$
(2)

For example, in Fig. 1, one of the atomic triples in Axioms is:

figure o

Rule (2) yields a proof term with the following shape for each combination of this triple and frame V:

$$ \forall ((l,s), (l',s')) \in \llbracket \texttt {count++} \rrbracket . \forall V \in \textsf {Views}.\, \lfloor {\mathsf {arc}()} * V\rfloor (s) \Rightarrow \lfloor {\mathsf {arc}() * \mathsf {arc}()} * V\rfloor (s') $$

3.4 Integrating Local State

Rule (2) is not sufficient for the ARC proof in Fig. 1. First, the view atom countCopy(c) refers to a local variable c, not a value. Second, the view \(\mathsf {arc}()\) is defined using the iterator variable n. Finally, we need the ability to choose whether atoms appear in a view based on local conditions to encode assertions such as .

To incorporate these local-state properties into the rule, we introduce syntactic view expressions, with the following syntax:

$$ \begin{array}{rcl} P&\; {:}{:}= \;\;&\textsf {emp} \, \mid \, (B \rightarrow a[n](\overline{e})) * P \end{array} $$

View expressions are used to encode Starling’s assertion syntax. Each view expression P is a \(*\)-composition of atom expressions. These have a name a, a list \(\overline{e}\) of integer or boolean argument expressions, an integer iterator expression n, and a boolean guard expression B. The argument, iterator, and guard expressions are all interpreted in the local state.

To map a view expression to a view, we must interpret its local-state expressions. Given a local state l and expression X, we write l(X) for the value of X in l. Using this, we define a function \(\llbracket - \rrbracket _{l}\) which maps from view expressions into views:

figure p

Here, the empty view expression maps to an empty multiset, i.e. the unit plain view. Other view expressions map to the appropriate view atoms, dictated by the values of the local-state expressions. The argument expressions dictate the values of the view atom’s arguments. The guard expression controls whether any view atoms are created, and the iterator expression dictates the number of instances of the view atom.
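The interpretation just described can be sketched in Python. The encoding is an assumption made for illustration: a view expression is a list of `(guard, name, iterator, args)` tuples whose components are functions of the local state, a plain view is a `Counter`, and a non-positive iterator is taken to produce no atoms. The names `interpret` and `expr` are hypothetical.

```python
from collections import Counter


def interpret(view_expr, local):
    """Map a view expression to a plain view under the local state `local`."""
    view = Counter()                      # emp maps to the empty multiset
    for guard, name, iterator, args in view_expr:
        if not guard(local):
            continue                      # guard false: no atoms are created
        n = iterator(local)
        if n <= 0:
            continue                      # assumption: no atoms for n <= 0
        values = tuple(e(local) for e in args)
        view[(name, values)] += n         # n instances of the atom
    return view


# Hypothetical example: (c == 1 -> countCopy[1](c)) * (true -> arc[2]())
expr = [
    (lambda l: l["c"] == 1, "countCopy", lambda l: 1, [lambda l: l["c"]]),
    (lambda l: True, "arc", lambda l: 2, []),
]
```

With `c` equal to 1 in the local state, `interpret(expr, {"c": 1})` contains one `countCopy(1)` atom and two `arc()` atoms; with any other value of `c` the guarded atom disappears.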

To integrate this into our core proof rule, we amend Axioms so that pre- and post-conditions are view expressions, not plain views. This means that they must be interpreted by the semantic function \(\llbracket - \rrbracket _{l}\). Our modified rule is as follows:

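$$ \begin{array}{l} \forall \,\{P\}\,c\,\{Q\} \in \textsf {Axioms}. \\ \forall ((l,s), (l',s')) \in \llbracket c \rrbracket . \forall V \in \textsf {Views}.\, \lfloor \llbracket P \rrbracket _{l} \cup _\mathsf{m} V\rfloor (s) \Rightarrow \lfloor \llbracket Q \rrbracket _{l'} \cup _\mathsf{m} V\rfloor (s') \end{array} $$
(3)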

3.5 Context Reduction

The quantification \(\forall V\) over context views means that Rule (3) cannot be used directly for automated verification. As two smaller views can be composed into a larger one, there are arbitrarily many possible values of V, and by default we must consider them all.

Other logics allow a degree of context reduction here. For example, in Owicki-Gries, if two threads separately assert \(F_1\) and \(F_2\), and each is preserved, we need not consider the context \(F_1 \wedge F_2\). This means we can validate our proof outline for an unbounded number of threads by considering a finite set of entailments.

We cannot use this simple context reduction, because in Views any context may contribute information not represented in its sub-views. This generality is desirable – it is what gives our proof system its power. We can preserve it while gaining context reduction by defining reification in a particular way.

Defining Function. The first restriction on reification is that we only consider functions where the reification of a composite view implies the conjunction of its sub-view reifications. In other words, view composition cannot lose information, which lets us avoid considering sub-views of composite views. More formally, we require that for all views, \( \lfloor P *Q\rfloor \Rightarrow \lfloor P\rfloor \wedge \lfloor Q\rfloor \).

The second restriction is that we bound the set of views that can contribute information to the reification. Intuitively, this means that we only need to consider these defining sub-views in our proof rule. To enforce this, we require that the reification function is derived from a syntactic defining function.

In a Starling proof, the defining function is given precisely by the constraints. For example, in Fig. 1 we have:

figure q

On the left we have a view pattern countCopy(m) * countCopy(n), while on the right we have a formula giving the meaning for this pattern.

View patterns allow a definition to match many different views with similar shapes. A view pattern r has the syntax:

$$ r \; {:}{:}= \; \textsf {emp} \, \mid \, a[n](\overline{x}) * r $$

A pattern is either \({\mathsf {emp}}\), or a \(*\)-composition of pattern atoms. Each atom has a name a, variable arguments \(\overline{x}\) which bind to the arguments of a view atom, and an iterator variable n which records the number of view atoms matched.

A definition is then a tuple \((\overline{y}, r, p)\) where r is a view pattern, p is a formula of the underlying theory, and \(\overline{y}\) is a set of free variables used in the definition. In the example constraint above, \(\overline{y}\) is the set of variables \(\{ m, n \}\), the pattern r is \(\mathsf {countCopy}[1](m) * \mathsf {countCopy}[1](n)\), and the formula p is \((m \ne 1) \vee (n \ne 1)\).

A defining function D is then a finite set of definitions (derived from the constraints in the proof). Using such a D, we can then induce a reification function where only definitions contribute information. The reification of a view-expression V, for a shared state s, is the conjunction of all the definitions that match some sub-view of V.

figure r

We write \(r \subseteq _\mathsf{m} V\) (using multiset subset) to indicate that r is a sub-view of V, meaning there is a pattern match.

A pattern may be matched under any instantiations of its free variables \(\overline{y}\). We express this using the special quantification \(\hat{\forall } \overline{y}\). Given a formula X that includes r and p, \(\hat{\forall } \overline{y}. X\) is shorthand for quantifying over all possible assignments to \(\overline{y}\), and substituting in r and p. This has the effect of converting r into a plain view. Many theories, such as SMT, can natively handle the \(\hat{\forall } \overline{y}\) construction without further expansion.
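The induced reification can be sketched as a brute-force pattern matcher in Python. This is a simplified illustration, not the tool's algorithm: it handles only non-iterated patterns, enumerates matches explicitly instead of leaving the quantification to a solver, and represents a definition as a hypothetical pair of a list of atom names and a Python predicate. The single-atom formula for \(\mathsf {countCopy}\) is an assumption reconstructed from the prose in Sect. 2.2; the two-atom formula is the one given above.

```python
from collections import Counter
from itertools import permutations


def matches(pattern, view):
    """Yield argument bindings for each way `pattern` (a list of atom names)
    matches distinct atom instances of `view` (a Counter of (name, args))."""
    instances = list(view.elements())     # expand multiplicities
    seen = set()
    for combo in permutations(range(len(instances)), len(pattern)):
        chosen = [instances[i] for i in combo]
        if all(a[0] == name for a, name in zip(chosen, pattern)):
            binding = tuple(a[1] for a in chosen)
            if binding not in seen:       # skip duplicate bindings
                seen.add(binding)
                yield binding


def reify(defs, view, state):
    """Conjunction of every definition whose pattern matches a sub-view."""
    return all(formula(state, *binding)
               for pattern, formula in defs
               for binding in matches(pattern, view))


defs = [
    # Assumed: countCopy(c) -> c == 1 => (!free && count == 0)
    (["countCopy"],
     lambda s, a: a[0] != 1 or (not s["free"] and s["count"] == 0)),
    # countCopy(m) * countCopy(n) -> (m != 1) || (n != 1)
    (["countCopy", "countCopy"],
     lambda s, a, b: a[0] != 1 or b[0] != 1),
]
```

For instance, a view holding two `countCopy(1)` atoms reifies to false in every state, because the two-atom definition matches and its formula fails, while the empty view reifies to true because no definition matches.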

Fig. 3. Derivation of Rule (4), with outer quantifiers elided.

Rule Context Reduction. Using this definition, we can modify Rule (3) to reduce the contexts we consider to just those in the defining function.

First we introduce two lemmas. The first lemma (reification monotone) states that the reifications of larger views are more restrictive than those of smaller views. This justifies us considering only defining views in the premise of the proof rule, because any larger context will be more restrictive.

Lemma 1

(Reification monotone). \(V_1 \subseteq _\mathsf{m} V_2 \implies (\forall s. \lfloor V_2\rfloor (s) \Rightarrow \lfloor V_1\rfloor (s))\)

The second lemma (view adjoint) defines the relationship between multiset union \(\cup _\mathsf{m}\), multiset subset \(\subseteq _\mathsf{m}\), and multiset minus \(\setminus _\mathsf{m}\). We use \(\setminus _\mathsf{m}\) in our new rule to construct a ‘weakest context’, analogous to a weakest precondition.

Lemma 2

(View adjoint). \( (V_1 \setminus _\mathsf{m} V_2) \subseteq _\mathsf{m} V_3 \implies V_1 \subseteq _\mathsf{m} (V_2 \cup _\mathsf{m} V_3) \)
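Lemma 2 can be sanity-checked by exhaustive enumeration over small multisets. The Python sketch below models views as `Counter` objects, where `-` is truncated multiset minus and `+` adds multiplicities; the function names, the two atom names and the multiplicity bound are arbitrary choices for illustration, and a bounded check is of course not a proof.

```python
from collections import Counter
from itertools import product


def subset(a, b):
    """Multiset inclusion: every atom occurs in b at least as often as in a."""
    return all(b[k] >= n for k, n in a.items())


def all_views(atoms, max_mult):
    """Every multiset over `atoms` with multiplicities in 0..max_mult."""
    for mults in product(range(max_mult + 1), repeat=len(atoms)):
        yield Counter({a: m for a, m in zip(atoms, mults) if m > 0})


def check_view_adjoint(atoms=("arc", "countCopy"), max_mult=2):
    """Assert Lemma 2 on every triple of small views; return the triple count."""
    views = list(all_views(atoms, max_mult))
    for v1, v2, v3 in product(views, repeat=3):
        if subset(v1 - v2, v3):            # (V1 \ V2) is a sub-view of V3
            assert subset(v1, v2 + v3)     # V1 is a sub-view of V2 union V3
    return len(views) ** 3
```

With the default arguments there are 9 views and therefore 729 triples to check.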

Now we take Rule (3) and (eliding the two outer quantifiers) rewrite it as shown in Fig. 3. This at last gives us Starling’s core proof rule:

figure s

This is the rule that we use to generate verification conditions from Starling input proofs such as Fig. 1. The atomic steps of the program form the set Axioms; the built-in semantics of commands specify \(\llbracket c \rrbracket \); and the constraints specify the defining function D and the reification \(\lfloor - \rfloor \). The significant advantage of this rule is that, rather than quantify over an infinite set of context views, it quantifies only over finite sets, and therefore generates a finite set of proof terms.

Consider the \(\mathsf{arc}()\) proof term we examined in Sect. 3.3. If rather than using Rule (2), we apply our new rule, we get the following outcome:

$$ \begin{array}{l} \forall ((l,s), (l',s')) \in \llbracket \texttt {count++} \rrbracket . \\ \forall (\overline{y}, r,p)\in D.\, \hat{\forall } \overline{y} . \lfloor \llbracket \mathsf {arc}() \rrbracket _{l} \cup _\mathsf{m} (r \setminus _\mathsf{m} \llbracket \mathsf {arc}() * \mathsf {arc}() \rrbracket _{l'})\rfloor (s) \Rightarrow p(s') \end{array} $$

3.6 Finite Pattern Matching

Rule (4) gives us a finite set of proof terms. However, we must also translate each term into finitely many verification conditions. The key issue is ensuring that the number of pattern matches in each reification is finite.

Most cases of pattern matching are trivially finite, but iterated views require careful treatment. An iterated view expression \(B \rightarrow a[n](\overline{y})\) can produce n many subviews. As a result, if a view pattern r and view V are both iterated, there may be unboundedly many valid distinct matches (for \(i = 1, 2, \ldots \)).

To solve this, a definition \((\overline{y}, r, p)\) where p is dependent on an iterator n must satisfy the following downclosure properties:

figure t

These properties let us just consider the largest iterator value when constructing pattern matches. Our tool checks downclosure as an extra proof obligation.

A further subtlety is that iterated definitions can match against combinations of atoms when they can be made equal through parameter equality. For example, \(\mathsf {A}[n](x)\) matches \((B_1 \rightarrow \mathsf {A}[i](y)) * (B_2 \rightarrow \mathsf {A}[j](z))\) to form \(((B_1 \wedge B_2 \wedge y = z) \rightarrow \mathsf {A}[i+j](y))\). We can solve this by expanding out the equalities as if they are separate view atoms before matching – this does not change the view’s meaning.

4 SMT Back-End

We now have a proof outline for the ARC (Sect. 2) and a proof rule to convert it into verification conditions (Sect. 3). We now show how to verify these conditions using an SMT solver – in our case, Z3 [5]. To do this, we must convert the defining function, multiset minus, and command semantics into forms supported by Z3.

Definition Quantification. We begin by eliminating the defining function. Consider the following term we generated from our running example at the end of Sect. 3.5:

figure u

As the defining function D is bounded, we can expand the quantification into a finite set of terms. For example, for the pattern \(\mathsf {arc}[n]()\), we get the following term:

figure v

We get this by substituting the view pattern into the left of the implication in place of r, and the corresponding formula into the right in place of p. We also eliminate the \(\hat{\forall } \overline{y}\) by quantifying over the single variable n that is bound in \(\overline{y}\). For simplicity later, we treat r as a view expression over \(l'\).

Multiset Minus. We next eliminate multiset minus. We can easily reduce our proof term so that all instances of \(\setminus _\mathsf{m}\) have the following shape:

$$ \llbracket (B_1 \rightarrow a[n_1](\overline{y_1})) * P \rrbracket _{l'} \setminus _\mathsf{m} \llbracket B_2 \rightarrow a[n_2](\overline{y_2}) \rrbracket _{l'} $$

We eliminate this shape by case-splitting on the relationship between \(B_1\) and \(B_2\), \(n_1\) and \(n_2\), and \(\overline{y_1}\) and \(\overline{y_2}\). The main subtlety is that some, but not all instances in the iterator \(a[n_1]\) may be subtracted, i.e. we may be left with the iterator \(a[n_1 - n_2]\). If we are left with anything on the right of the \(\setminus _\mathsf{m}\), we then apply the simplification step to the remainder formula P.

In our example, subtracting \(\llbracket \mathsf {arc}() * \mathsf {arc}() \rrbracket _{l'}\) from \(\llbracket \mathsf {arc}[n]() \rrbracket _{l'}\) leaves \(n - 2\) copies of \(\mathsf {arc}()\). If \(n \le 2\), nothing is left: we express this as a guarded view. The multiset minus rewrite yields the following term:

$$ \begin{array}{l} \forall ((l,s), (l',s')) \in \llbracket \texttt {count++} \rrbracket . \forall n. \\ \lfloor {\llbracket \mathsf {arc}() \rrbracket _{l}} \cup _\mathsf{m} {\llbracket (n> 2 \rightarrow \mathsf {arc}[n - 2]()) \rrbracket _{l'}}\rfloor \Rightarrow (n > 0 \Rightarrow \lnot free \wedge n \le count)(s') \end{array} $$
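The guard \(n > 2\) in this term can be checked against a direct multiset computation. In the Python sketch below, `arcs(n)` is a hypothetical helper that builds the plain view \(\mathsf {arc}[n]()\) as a `Counter`, and `Counter` subtraction plays the role of \(\setminus _\mathsf{m}\) because it discards non-positive multiplicities.

```python
from collections import Counter


def arcs(n):
    """The plain view arc[n](): n copies of the arc() atom (none if n <= 0)."""
    return Counter({("arc", ()): n}) if n > 0 else Counter()


def remainder(n):
    """arc[n]() minus arc() * arc(), computed as a multiset."""
    return arcs(n) - arcs(2)


def guarded(n):
    """The guarded view (n > 2 -> arc[n - 2]()) from the rewritten term."""
    return arcs(n - 2) if n > 2 else Counter()
```

For every integer n the two functions agree, which is the case split the rewrite performs symbolically.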

Commands as Predicates. To eliminate the command, we recast it as a boolean predicate over pre- and post-states. To do so, we instantiate two copies of each variable: one set for \((l, s)\), and another (primed) set for \((l', s')\). We conjoin this command predicate into the proof term, replacing the outer quantification with implicit ones over the variable sets. Expanding out the reification and the local-state interpretations, and ensuring we handle the subtleties in Sect. 3.6, we get:

$$ \begin{array}{l} \left( \begin{array}{@{}l@{}} count' = count + 1 \wedge free' = free \wedge c' = c \\ {} \wedge (1> 0 \Rightarrow \lnot free \wedge 1 \le count) \\ {} \wedge (n> 2 \Rightarrow (n-1> 0 \Rightarrow \lnot free \wedge n-1 \le count)) \\ {} \wedge (n> 2 \Rightarrow (n-2> 0 \Rightarrow \lnot free \wedge n-2 \le count)) \end{array} \right) \\ \qquad \qquad \qquad \qquad \implies (n > 0 \Rightarrow \lnot free' \wedge n \le count') \end{array} $$

SMT Term. Finally, we negate the outer implication for each condition, so Z3 tries to find counter-example instantiations for the condition’s variables. We can also simplify the term. For example, we remove the \(n-2\) case, as it is implied by the \(n-1\) case. The resulting term, in the SMT-LIB language accepted by Z3, is:

figure w
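The counter-example search that the solver performs can be imitated on a bounded domain in plain Python. The sketch below encodes the simplified condition from the text (with the \(n-2\) conjunct dropped and the unused local variable c omitted) and enumerates small values of count, free and n. The function names and the bound are arbitrary illustrative choices, and a bounded search only approximates what Z3 establishes for all integers.

```python
from itertools import product


def implies(a, b):
    return (not a) or b


def vc_holds(count, free, n, with_arc_pre=True):
    """Verification condition for the count++ step against the arc[n]() definition."""
    count2, free2 = count + 1, free       # command predicate fixes the primed state
    pre = True
    if with_arc_pre:
        pre = pre and (not free and 1 <= count)   # arc() in the precondition
    pre = pre and implies(n > 2, implies(n - 1 > 0, not free and n - 1 <= count))
    post = implies(n > 0, not free2 and n <= count2)
    return implies(pre, post)


def counterexamples(with_arc_pre=True, bound=6):
    """All (count, free, n) in the bounded domain that falsify the condition."""
    space = product(range(-bound, bound + 1), (False, True),
                    range(-bound, bound + 1))
    return [(c, f, n) for c, f, n in space
            if not vc_holds(c, f, n, with_arc_pre)]
```

With the \(\mathsf {arc}()\) precondition in place the search finds no counter-example in the bounded domain. Dropping it (`with_arc_pre=False`) yields assignments such as count = 0, free = true, n = 1, which shows that the precondition is doing real work.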

5 GRASShopper Back-End

For heap-based programs like the ARC in Fig. 2, we target the GRASShopper solver [20] rather than Z3. GRASShopper is a separation-logic solver, but its underlying model is based on sets of heap locations and reachability properties over sets. For example, the following GRASShopper predicate asserts that the set of locations Footprint contains a list with head x and tail y:

figure x

Here, acc(Footprint) is a spatial assertion claiming ownership of the locations in Footprint. The Btwn(next,x,z,y) predicate asserts that z is reachable between x and y by following the next field – in other words, z is in the list starting at x and ending at y. The set comprehension

figure y

therefore contains the set of locations in the list.

Most of the pipeline for producing GRASShopper proofs is similar to the SMT case. However, the presence of a heap model causes some differences. Suppose we try to model the allocated ARC equivalent of our previous working example,

figure z

Given a context of \(\mathsf {arc}(x) * \mathsf {arc}(x)\) (that is, the same x as in the local state of the thread), our translation would give the following in pseudo-SMT format:

figure aa

As we cannot discharge this term using SMT, we convert it into a GRASShopper procedure. Input and output variables are represented by arguments to the procedure. The command becomes the procedure body, and the left- and right-hand sides of the proof rule body become requires and ensures clauses.

Both the requires and ensures clause existentially quantify over a footprint set representing the whole heap – in the ARC, this is the ArcFoot set. This allows predicates to require access to the footprint, represented by acc(ArcFoot), and to conjoin constraints on this shared footprint arising from the views.

In general, it would not be sound to introduce an arbitrary existential to the consequent side of the term. The problem is that the existential might be witnessed differently across different terms (see the derivation in Sect. 3). However, our encoding into GRASShopper is sound, because GRASShopper will always witness the footprint the same way, as the set of all available heap locations.

With this translation, the above pseudo-SMT query becomes:

figure ab

In some cases we need to model the mutation of variables. To do this, we declare fresh GRASShopper variables in the procedure body, and connect them to the input and output variables by assertion.

5.1 Example: CLH Queue Lock

GRASShopper’s support for dynamic data-structures allows us to target much more complex algorithms than the ARC. In this section we verify the queue-based CLH lock [16], which also demonstrates a subtle ownership-transfer pattern between threads. For space reasons, we give the main proof in Appendix B, and here only explain the key details.

The code and inline views are given in Fig. 4. In the CLH lock, each participating thread owns a single node. To contend for the lock, a thread adds its own node to the queue, and waits on its predecessor. Releasing the lock means setting the node’s lock flag to false. Once the predecessor is released, the thread can take hold of the lock.

This protocol is reflected in the views in Fig. 4. A node starts life dormant, i.e. not on the queue. It is then made active when its lock flag is set, and then is queued. Once the algorithm establishes that the node is at the end of the queue, it becomes locked. Finally, once the lock is released the node leaves the queue, and it becomes dormant again.
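The node life cycle just described can be read as a small state machine. The Python sketch below encodes it as a transition table; the state names follow the text, while the event names and helper functions are hypothetical labels chosen for illustration and are not part of Starling's syntax.

```python
# States follow the views described in the text; event names are assumptions.
TRANSITIONS = {
    ("dormant", "set_lock_flag"): "active",
    ("active", "enqueue"): "queued",
    ("queued", "reach_end_of_queue"): "locked",
    ("locked", "release"): "dormant",
}


def step(state, event):
    """Return the next node state, or raise if the protocol forbids the event."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in state {state!r}")


def run(events, start="dormant"):
    """Apply a sequence of events to a node that starts in the given state."""
    state = start
    for event in events:
        state = step(state, event)
    return state
```

One full acquire-and-release cycle, `run(["set_lock_flag", "enqueue", "reach_end_of_queue", "release"])`, returns the node to the dormant state, whereas an out-of-order event such as `step("dormant", "release")` is rejected.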

figure 4

Fig. 4. CLH queue-based lock algorithm. Note that the head pointer and pred field are ghost code necessary to verify the algorithm.

The key property of the CLH lock (and any lock) is mutual exclusion: each node is held exclusively, and the lock as a whole can only be held by one thread. In our approach, we can specify this using constraints, for example:

figure ac

The queue data-structure is similarly defined by constraints. For example, the \(\mathsf {locked}()\) atom is defined using GRASShopper assertions similar to the list_segment predicate above.

figure ad

The most subtle reasoning step happens in lines 2–6 of unlock in Fig. 4, when the thread releases the lock. As some other thread may be waiting on its current node, it cannot be reused immediately. Instead the thread takes ownership of its dormant predecessor. Thus threads always have a single exclusively-held node, but the exact node held varies over time.

This ownership transfer is reflected in the proof in Fig. 4 and the mutual exclusion constraints above. The terms passed to GRASShopper precisely encode the required properties, even though GRASShopper itself cannot reason about ownership transfer. Other reasoning approaches would capture this through regions or shared protocols: we encode it through views.
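For readers unfamiliar with the algorithm, the following Python sketch shows the node hand-over at the level of executable code. It is a plain textbook-style CLH lock, not the annotated program of Fig. 4: it omits the ghost head pointer and pred field, the class and function names are assumptions, and an ordinary `threading.Lock` stands in for the atomic exchange on the tail pointer, which Python does not provide directly.

```python
import threading
import time


class Node:
    """A queue node; `locked` is True while its owner holds or awaits the lock."""

    def __init__(self):
        self.locked = False


class CLHLock:
    def __init__(self):
        self._tail = Node()             # initial node with its flag cleared
        self._guard = threading.Lock()  # stands in for an atomic exchange (assumption)

    def _exchange_tail(self, node):
        with self._guard:
            previous, self._tail = self._tail, node
        return previous

    def lock(self, node):
        node.locked = True                # the node becomes active
        pred = self._exchange_tail(node)  # the node joins the queue
        while pred.locked:                # wait on the predecessor
            time.sleep(0)                 # yield to other threads
        return pred                       # the caller now holds the lock

    def unlock(self, node, pred):
        node.locked = False               # release; a successor may still read node
        return pred                       # ownership transfer: reuse the predecessor


def run_demo(num_threads=4, iterations=50):
    """Increment a shared counter under the lock and return its final value."""
    lock = CLHLock()
    shared = {"count": 0}

    def worker():
        node = Node()
        for _ in range(iterations):
            pred = lock.lock(node)
            value = shared["count"]
            time.sleep(0)                 # widen the window for lost updates
            shared["count"] = value + 1
            node = lock.unlock(node, pred)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["count"]
```

The return value of `unlock` is the point of interest: the releasing thread gives up its old node and continues with its predecessor's, so each thread holds exactly one node at any time. A run such as `run_demo(4, 50)` loses no updates if mutual exclusion holds, although a test of this kind can only expose violations, not prove their absence.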

6 Examples and Performance Results

We have tested Starling on a range of examples: the ARC algorithm discussed in Sect. 2; a standard compare-and-swap spinlock; a ticket-based FIFO lock, as used in Linux [2]; a reader-writer lock which combines the classic Courtois et al. algorithm [3] with tickets; Peterson’s algorithm; the CLH queue-lock discussed in Sect. 5 [16]; and a lock-coupling list algorithm previously verified by Vafeiadis [26] (note we verify memory safety, not linearizability). For several of these we have verified both a static version encoded in shared variables (using SMT) and a version allocated on the heap (using GRASShopper).

figure 5

Fig. 5. Benchmarks for example algorithms.

These algorithms are small in size, but all are challenging to verify, and each demonstrates an aspect of Starling’s reasoning. Verifying the ARC example would typically require a primitive notion of “permissions” in separation logic – Starling can directly handle it without resorting to new metatheory. The CLH lock has an implied protocol between threads that performs ownership transfer of the node from one thread to the next, again handled directly by the theory. The other synchronisation algorithms similarly involve subtle protocols between threads that, in other reasoning systems, would need auxiliary proof constructs. The lock-coupling example shows that we can reason about complex fine-grained data-structures where the protocol is entwined with the list nodes.

Figure 5 gives performance statistics for our examples. From left to right, the columns give: the total lines of input code and proof (including auxiliary GRASShopper code); the approximate number of those lines that are proof annotations; the lines of generated GRASShopper output; the total number of proof terms generated; the number of those terms successfully discharged using SMT/Z3 (the remainder are sent to GRASShopper); the total proof time (excluding GRASShopper); of that time, the totals spent in the tool itself and in SMT/Z3; the total memory in the .NET runtime working set at the end of the proof, in mebibytes; and the average maximum resident set size over 3 runs of GRASShopper on the output from Starling, in mebibytes (these loosely approximate the total memory used).

Times reported are the average of 3 runs. Benchmarks were run on a 2016 series MacBook Pro, with 8 GB RAM and a 2.9 GHz dual-core Intel Core i5.

7 Related Work

Our approach builds on Views [6], and thus is part of the family of logics descended from Concurrent Separation Logic [18]. These logics all use separating conjunction to reason about distinct threads, and many of these logics have introduced auxiliary constructs to assist with reasoning. For example, Svendsen and Birkedal’s iCAP [24] combines reasoning about interference (derived from Rely-Guarantee [14]), abstraction through abstract predicates, a rich system of protocols based on capabilities, and higher-order propositions. Other significant logics include CaReSL [25], TaDA [22], FCSL [17], and others – each comes with a different collection of auxiliary constructs.

As discussed in Sect. 3, our approach also has similarities to Owicki-Gries reasoning [19]. In Owicki-Gries, many kinds of interaction between threads need to be encoded through auxiliary variables. Views allow us to capture these interactions directly in a more intuitive style.

Starling inherits much of the generality of the Views framework – see [6] for encodings of multiple previous logics. We can encode many of the auxiliary proof constructs used in other logics. For example, Boyland-style fractional permissions [1] can be encoded by a view with a permission-value argument, which can then be split and joined by entailment. iCAP-style protocols can be encoded by making each protocol state into a view, and using constraints to enforce mutual exclusion between these state-views.
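As a rough illustration of the fractional-permission encoding, the Python sketch below represents a view as a pair of a resource name and a permission value, with splitting and joining as the two directions of an entailment and a consistency check playing the role of a constraint. All names here (`split`, `join`, `consistent`, `can_write`) are hypothetical, and the sketch is not Starling syntax; it only shows the arithmetic such an encoding relies on.

```python
from fractions import Fraction


def split(view):
    """perm(x, p) entails perm(x, p/2) * perm(x, p/2)."""
    resource, p = view
    half = p / 2
    return [(resource, half), (resource, half)]


def join(left, right):
    """perm(x, p) * perm(x, q) entails perm(x, p + q)."""
    (x1, p), (x2, q) = left, right
    if x1 != x2:
        raise ValueError("can only join permissions on the same resource")
    return (x1, p + q)


def consistent(views):
    """Constraint: permissions are positive and sum to at most 1 per resource."""
    totals = {}
    for resource, p in views:
        if p <= 0:
            return False
        totals[resource] = totals.get(resource, Fraction(0)) + p
    return all(total <= 1 for total in totals.values())


def can_write(view):
    """Only the full permission allows writing; any positive share allows reading."""
    return view[1] == 1


full = ("x", Fraction(1))
reader_a, reader_b = split(full)
```

Two readers holding `reader_a` and `reader_b` are consistent and neither can write; joining them restores the full permission, and holding `full` alongside any other share of the same resource is inconsistent.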

A few CSL-style logics have automated tool support. FCSL [17] and Verifast [13] both support automated proof-checking, albeit with a considerable annotation burden as all steps must be given explicitly. SmallfootRG [26] supports proof-checking for the RGsep logic, but requires annotations of invariants and rely-conditions – in our system these are defined implicitly by the constraints.

Caper [7] is the tool most similar to ours. It supports reasoning about functional specifications that our tool cannot presently handle – for example that an element is correctly inserted into a bag. However, Caper’s logic is built on auxiliary guard algebras, shared regions, and actions. It is therefore significantly more complex than our approach both in reasoning and in metatheory. Caper uses Z3, as do we, but its heap reasoning is custom-built, and we are uncertain whether it could verify an example of the complexity of the CLH lock or lock-coupling list. We handle these examples using the GRASShopper heap solver [20], and our approach is designed to be generic in the choice of back-end solver.

We have not undertaken a precise comparison, but we believe that, for our heap-based examples, all competing tools would require significantly more annotations. For example, the CLH lock is our most challenging algorithm: in Verifast, its code and proof require 343 lines, while Starling requires 134 lines (see footnote 3).

Several other tools share similarities with our approach. VCC [4] is a verifier based on Z3 which has been used to verify large-scale concurrent C programs. In VCC, concepts such as permission and ownership are encoded through auxiliary state. Our approach encodes these properties through view interactions.

QED [9] is a refinement-based approach to verification: concurrent programs are related to their atomic specifications by a series of sound refinement steps. We are hopeful that our approach could be combined with this style of reasoning as well as CSL-style program logic.

Our SMT/Z3 back-end has similarities to Threader [11], and unlike our tool, Threader can infer invariants using a Horn-clause solver. However, it only targets shared-variable algorithms – we can handle heap-based algorithms. Invariant inference in our approach is a topic of future work.

There is a large body of work on model-checking concurrent systems, e.g. [21, 27]. Compared with model-checking, our approach requires significant annotation, but our context reduction means that our proofs apply to an unbounded number of threads, context switches, and loop unrollings.

8 Conclusions

We have presented a new logic-based approach to verifying concurrent programs. Our approach is lightweight, automated, and based on a sound bedrock of existing theory. Because we build on the generic Views framework, we believe our approach could be reused by other concurrent logics as a way to target sequential solvers.

One next step will be invariant inference for Starling. Our proof terms are already in quasi-Horn clause form, and preliminary experiments suggest we can infer view definitions using an off-the-shelf solver such as HSF [10]. We also plan to extend Starling with modular reasoning, meaning that proofs of libraries and clients can be performed separately, as in iCAP [24]. Finally, we plan to extend Starling to prove algorithm linearizability rather than pre-post specifications, as in Vafeiadis [26] and Liang and Feng [15].