A Refinement Proof for a Garbage Collector

Havelund, Klaus; Shankar, Natarajan

doi:10.1007/978-3-030-31514-6_6

Klaus Havelund & Natarajan Shankar

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11500)

Abstract

We describe how the PVS theorem prover has been used to verify a safety property of a widely studied garbage collection algorithm. The safety property asserts that “nothing but garbage is ever collected”. The garbage collection algorithm and its composition with the user program can be regarded as a concurrent system with two processes working on a shared memory. Such concurrent systems can be encoded in PVS as state transition systems using a model similar to TLA [16]. The safety criterion is formulated as a refinement and proved using refinement mappings. Russinoff [19] originally verified the algorithm in the Boyer-Moore prover, but his proof was not based on refinement. Furthermore, the safety property formulation required a glass box view of the algorithm. Using refinement, however, the safety criterion makes sense independent of the garbage collection algorithm. As a by-product, we encode a version of the theory of refinement mappings in PVS. The paper reflects substantial work that was done over two decades ago, but which is still relevant.

K. Havelund – The research performed by this author was carried out at LITP, Paris 6, France (supported by an HCM grant); SRI International, California, USA; Aalborg University, Denmark; and Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration.

Notes

1. PVS stands for Prototype Verification System.
2. The actual PVS specification shown on page 23 is more abstract and does not specify the memory as being implemented as an array. We use an array implementation here for clarity of presentation.
3. By formulating this colouring as an iteration, we can avoid introducing a history variable at a lower refinement level. Note that any node can be coloured, not only accessible nodes. This allows a later refinement to colour nodes that originally were accessible, but later have become garbage.
4. This allows for stuttering where rules are applied without changing the state.

References

1. PVS specification and verification system. http://pvs.csl.sri.com. Accessed 03 Mar 2019
2. Abadi, M., Lamport, L.: The existence of refinement mappings. Theor. Comput. Sci. 82, 253–284 (1991)
3. Ben-Ari, M.: Algorithms for on-the-fly garbage collection. ACM TOPLAS 6, 333–344 (1984)
4. Burdy, L.: B vs. Coq to prove a garbage collector. In: Boulton, R.J., Jackson, P.B. (eds.) 14th International Conference on Theorem Proving in Higher Order Logics: Supplemental Proceedings, pp. 85–97. September (2001)
5. Coupet-Grimal, S., Nouvet, C.: Formal verification of an incremental garbage collector. J. Logic Comput. 13(6), 815–833 (2003)
6. Dijkstra, E.W., Lamport, L., Martin, A., Scholten, C.S., Steffens, E.F.M.: On-the-fly garbage collection: an exercise in cooperation. Commun. ACM 21(11), 966–975 (1978)
7. Doligez, D., Gonthier, G.: Portable, unobtrusive garbage collection for multiprocessor systems. In: Proceedings of the 21st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pp. 70–83. ACM (1994)
8. Sandberg Ericsson, A., Myreen, M.O., Åman Pohjola, J.: A verified generational garbage collector for CakeML. In: Ayala-Rincón, M., Muñoz, C.A. (eds.) ITP 2017. LNCS, vol. 10499, pp. 444–461. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66107-0_28
9. Gammie, P., Hosking, A.L., Engelhardt, K.: Relaxing safely: verified on-the-fly garbage collection for x86-tso. In: ACM SIGPLAN Notices, vol. 50, pp. 99–109. ACM (2015)
10. Gonthier, G.: Verifying the safety of a practical concurrent garbage collector. In: Alur, R., Henzinger, T.A. (eds.) CAV 1996. LNCS, vol. 1102, pp. 462–465. Springer, Heidelberg (1996). https://doi.org/10.1007/3-540-61474-5_103
11. Havelund, K.: Mechanical verification of a garbage collector. In: Rolim, J., et al. (eds.) IPPS 1999. LNCS, vol. 1586, pp. 1258–1283. Springer, Heidelberg (1999). https://doi.org/10.1007/BFb0098007
12. Havelund, K., Shankar, N.: Experiments in theorem proving and model checking for protocol verification. In: Gaudel, M.-C., Woodcock, J. (eds.) FME 1996. LNCS, vol. 1051, pp. 662–681. Springer, Heidelberg (1996). https://doi.org/10.1007/3-540-60973-3_113
13. Havelund, K., Shankar, N.: A mechanized refinement proof for a garbage collector. Technical report, October 1997. http://www.havelund.com/Publications/gc-refine-report.pdf
14. Stenzel, O.: The Physics of Thin Film Optical Spectra. SSSS, vol. 44, pp. 163–180. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-21602-7
15. Jackson, P.B.: Verifying a garbage collection algorithm. In: Grundy, J., Newey, M. (eds.) TPHOLs 1998. LNCS, vol. 1479, pp. 225–244. Springer, Heidelberg (1998). https://doi.org/10.1007/BFb0055139
16. Lamport, L.: The temporal logic of actions. ACM TOPLAS 16(3), 872–923 (1994)
17. McCreight, A., Shao, Z., Lin, C., Li, L.: A general framework for certifying garbage collectors and their mutators. In: ACM SIGPLAN Notices, vol. 42, pp. 468–479. ACM (2007)
18. Pixley, C.: An incremental garbage collection algorithm for multi-mutator systems. Distrib. Comput. 3, 41–50 (1988)
19. Russinoff, D.M.: A mechanically verified incremental garbage collector. Formal Aspects Comput. 6, 359–390 (1994)
20. van de Snepscheut, J.L.A.: Algorithms for on-the-fly garbage collection revisited. Inf. Process. Lett. 24(4), 211–216 (1987)

Author information

Authors and Affiliations

Jet Propulsion Laboratory, California Institute of Technology, Pasadena, USA
Klaus Havelund
Computer Science Laboratory, SRI International, Menlo Park, USA
Natarajan Shankar

Corresponding author

Correspondence to Klaus Havelund.

Editor information

Editors and Affiliations

Technische Universität Wien, Vienna, Austria
Ezio Bartocci
University of Maryland, College Park, MD, USA
Rance Cleaveland
Technische Universität Wien, Vienna, Austria
Radu Grosu
University of Pennsylvania, Philadelphia, PA, USA
Oleg Sokolsky

Appendices

A Formalization in PVS

This appendix describes how in general transition systems and refinement mappings are encoded in PVS, and in particular how the garbage collector refinement is encoded.

A.1 Transition Systems and Their Refinement

Recall from Sect. 3 that an observed transition system is a five-tuple of the form \((\varSigma, \varSigma_o, I, N, \pi)\) (Definition 4). In PVS we model this as a theory with two type definitions and three function definitions.

The correspondence with the five-tuple is as follows: \(\varSigma = \mathtt{State}\), \(\varSigma_o = \mathtt{O\_State}\), \(\pi = \mathtt{proj}\), \(I = \mathtt{init}\) and \(N = \mathtt{next}\). The init function is a predicate on states, while the next function is a predicate on pairs of states. We shall formulate the specification of the garbage collector as well as all its refinements in this way. It will become clear below how in particular the function next is defined. Now we can define what is a trace (Definition 2) and what is an invariant (Definition 3). This is done in the theory Traces.
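As a rough illustration of this five-tuple outside PVS, the following Python sketch bundles the three functions into a record, with the two type parameters playing the role of State and O_State. The class name `ObservedTransitionSystem` and the toy `counter` instance are assumptions made for illustration; they are not taken from the PVS theories.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

State = TypeVar("State")     # Sigma: the full state space
OState = TypeVar("OState")   # Sigma_o: the observed state space


@dataclass(frozen=True)
class ObservedTransitionSystem(Generic[State, OState]):
    init: Callable[[State], bool]          # I: predicate on states
    next: Callable[[State, State], bool]   # N: predicate on pairs of states
    proj: Callable[[State], OState]        # pi: projection onto the observed part


# Hypothetical toy instance: a counter modulo 4 whose parity is observed.
counter = ObservedTransitionSystem(
    init=lambda s: s == 0,
    next=lambda s, t: t == (s + 1) % 4 or t == s,   # one step, or stutter
    proj=lambda s: s % 2,
)
```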

The theory is parameterized with the State type of the observed transition system. The VAR declarations are just associations of types to names, such that in later definitions, axioms, and lemmas, these names are assumed to have the corresponding types. In addition, axioms and lemmas are assumed to be universally quantified with these names over the types. Note that pred[T] in PVS is short for the function space [T -> bool]. The type sequence[T] is short for [nat -> T]; that is: the set of functions from natural numbers to T. A sequence of States is hence an infinite enumeration of states. Given a transition system with initiality predicate init and next-state relation next, a sequence sq is a trace of this transition system if trace(init,next)(sq) holds. A predicate p is an invariant if invariant(init,next)(p) holds. That is: if for any trace tr, p holds in all positions n of that trace. Note how the predicate trace(init,next) (it is a predicate on sequences) is turned into a type in PVS by surrounding it with parentheses – the type containing all the elements for which the predicate holds, namely all the program traces.
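The two definitions can be approximated in Python for intuition. Since a program cannot inspect an infinite sequence, the sketch below checks the trace conditions only on a finite prefix, and it checks an invariant by enumerating the reachable states of a finite state space, which is enough to guarantee that the predicate holds at every position of every trace. All function names and the toy system are hypothetical.

```python
def is_trace_prefix(init, next_rel, sq, bound):
    """Check the trace conditions on positions 0..bound of the sequence sq.

    A sequence is modelled as a function from natural numbers to states.
    """
    return init(sq(0)) and all(next_rel(sq(n), sq(n + 1)) for n in range(bound))


def reachable(states, init, next_rel):
    """All states reachable from an initial state (finite state space only)."""
    seen = {s for s in states if init(s)}
    frontier = set(seen)
    while frontier:
        frontier = {t for s in frontier for t in states if next_rel(s, t)} - seen
        seen |= frontier
    return seen


def is_invariant(states, init, next_rel, p):
    """p holds in every reachable state, hence at every position of every trace."""
    return all(p(s) for s in reachable(states, init, next_rel))


# Hypothetical toy system: start at 0, add 2 modulo 4, or stutter.
toy_init = lambda s: s == 0
toy_next = lambda s, t: t == (s + 2) % 4 or t == s
```

With this toy system, "the state is even" is an invariant, while "the state is 0" is not, because state 2 is reachable.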

The next notion we introduce in PVS is that of a refinement between two observed transition systems (Definition 5). The theory Refine_Predicate below defines the function refines, which is a predicate on a pair of observed transition systems: a low level implementation system as the first parameter, and a high level specification system as the second parameter.

The theory is parameterized with the state space S_State of the high level specification theory, the state space I_State of the low level implementation theory, and the observed state space O_State, which we remember is common for the two observed transition systems. Refinement is defined as follows: for all traces i_tr of the implementation system, there exists a trace s_tr of the specification system, such that when mapping the respective projection functions to the traces, they become equal. The function map has the type map : [[D->R] -> [sequence[D] -> sequence[R]]] and simply applies a function to all the elements of a sequence. Finally, we introduce in the theory Refinement the notion of a refinement mapping (Definition 6) and its use for proving refinement (Theorem 1). The theory is parameterized with a specification observed transition system (prefixes S), an implementation observed transition system (prefixes I), an abstraction function abs, and an invariant I_inv over the implementation system.
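The role of map in the refinement definition can be seen in a small Python sketch, again modelling a sequence as a function on natural numbers. The names `seq_map` and `agree_up_to` and the two toy traces are assumptions for illustration; equality of infinite sequences is only approximated by comparing a finite prefix.

```python
def seq_map(f):
    """Lift f : D -> R to sequences, mirroring the type of the PVS function map."""
    return lambda sq: (lambda n: f(sq(n)))


def agree_up_to(sq1, sq2, bound):
    """Finite approximation of equality between two infinite sequences."""
    return all(sq1(n) == sq2(n) for n in range(bound))


# Hypothetical traces: the implementation counts modulo 4, the specification
# flips a bit. Observing the parity of the counter yields the same sequence.
i_tr = lambda n: n % 4
s_tr = lambda n: n % 2
i_proj = lambda s: s % 2
s_proj = lambda b: b
```

Here `seq_map(i_proj)(i_tr)` and `seq_map(s_proj)(s_tr)` agree on every prefix, which is the shape of the witness that the refinement definition asks for.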

The theory contains a number of assumptions on the parameters and a theorem, which has been proven using the assumptions. Hence, the way to use this parameterized theory is to apply it to arguments that satisfy the assumptions, prove these, and then obtain as a consequence, the theorem which states that the implementation refines the specification (corresponding to Theorem 1). This theorem has been proved once and for all. The assumptions are as stated in Definition 6. We shall further need to assume transitivity of the refinement relation, and this is formulated (and proved) in the theory Refine_Predicate_Transitive.
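For a finite system, proof obligations of this kind can be checked by brute force. The sketch below uses the standard refinement-mapping conditions (the projection is preserved, initial states map to initial states, each implementation step maps to a specification step, and the invariant holds); it is an assumption that these match Definition 6 exactly, since that definition is not reproduced in this appendix. The invariant is checked in its inductive form, which is sufficient but stronger than necessary. All names are hypothetical.

```python
def refinement_mapping_failures(i_states, i_init, i_next, i_proj,
                                s_init, s_next, s_proj, abs_fn, i_inv):
    """Return the violated obligations; an empty list means every check passed."""
    failures = []
    for s in i_states:
        if s_proj(abs_fn(s)) != i_proj(s):
            failures.append(("projection", s))
        if i_init(s) and not s_init(abs_fn(s)):
            failures.append(("initiality", s))
        if i_init(s) and not i_inv(s):
            failures.append(("invariant-init", s))
        for t in i_states:
            if i_inv(s) and i_next(s, t):
                if not i_inv(t):
                    failures.append(("invariant-step", (s, t)))
                if not s_next(abs_fn(s), abs_fn(t)):
                    failures.append(("simulation", (s, t)))
    return failures


# Hypothetical toy pair: a modulo-4 counter implements a bit that flips or stutters.
toy = dict(
    i_states=list(range(4)),
    i_init=lambda s: s == 0,
    i_next=lambda s, t: t == (s + 1) % 4 or t == s,
    i_proj=lambda s: s % 2,
    s_init=lambda b: b == 0,
    s_next=lambda b, c: c == 1 - b or c == b,
    s_proj=lambda b: b,
    i_inv=lambda s: True,
)
```

Calling `refinement_mapping_failures(abs_fn=lambda s: s % 2, **toy)` reports no failures, whereas the constant abstraction `lambda s: 0` violates the projection condition.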

A.2 The Specification

In this section we outline how the initial specification from Sect. 5 of the garbage collector is modeled in PVS. We start with the specification of the memory structure, and then continue with the two processes that work on this shared structure.

A.2.1 The Memory

The memory type is introduced in the theory Memory, parameterized with the memory boundaries. That is, NODES, SONS, and ROOTS define respectively the number of nodes (rows), the number of sons (columns/cells) per node, and the number of nodes that are roots. They must all be positive natural numbers (different from 0). There is also an obvious assumption that ROOTS is not bigger than NODES. These three memory boundaries are parameters to all our theories. The Memory type is defined as an abstract (non-empty) type upon which a constant and a collection of functions are defined. First, however, types of nodes, indexes and roots are defined. The constant null_array represents the initial memory containing 0 in all memory cells (axiom mem_ax1). The function son returns the pointer contained in a particular cell. That is, the expression son(n,i)(m) returns the pointer contained in the cell identified by node n and index i. Finally, the function set_son assigns a pointer to a cell. That is, the expression set_son(n,i,k)(m) returns the memory m updated in cell (n,i) to contain (a pointer to node) k. In order to define what is an accessible node, we introduce the function points_to, which defines what it means for one node, n1, to point to another, n2, in the memory m.
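A concrete array model of these operations might look as follows in Python. This is only a sketch: the PVS theory keeps Memory abstract (see Note 2), whereas here a memory is an immutable tuple of rows, and the bounds 4, 2 and 1 are hypothetical values chosen for the example. The curried call style mirrors expressions such as son(n,i)(m).

```python
NODES, SONS, ROOTS = 4, 2, 1   # hypothetical small bounds
assert 0 < ROOTS <= NODES and SONS > 0


def null_array():
    """Initial memory: every cell contains 0."""
    return tuple(tuple(0 for _ in range(SONS)) for _ in range(NODES))


def son(n, i):
    """son(n, i)(m) is the pointer stored in cell i of node n."""
    return lambda m: m[n][i]


def set_son(n, i, k):
    """set_son(n, i, k)(m) is m with cell (n, i) updated to point to node k."""
    def update(m):
        row = m[n][:i] + (k,) + m[n][i + 1:]
        return m[:n] + (row,) + m[n + 1:]
    return update


def points_to(n1, n2):
    """points_to(n1, n2)(m) holds if some cell of node n1 contains n2."""
    return lambda m: any(son(n1, i)(m) == n2 for i in range(SONS))
```

Because memories are tuples, set_son returns a new memory and leaves its argument untouched, which matches the functional reading of the PVS operations.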

The function accessible is then defined inductively, yielding the least predicate on nodes n (true on the smallest set of nodes) where either n is a root, or n is pointed to from an already reachable node k. Finally we define the operation for appending a garbage node to the list of free nodes, that can be allocated by the mutator. This operation is defined abstractly, assuming as little as possible about its behaviour. Note that, since the free list is supposed to be part of the memory, we could easily have defined this operation in terms of the functions son and set_son, but this would have required that we took some design decisions as to how the list was represented (for example where the head of the list should be and whether new elements should be added first or last). The axiom append_ax defining the append operation says that in appending a garbage node, only that node becomes accessible, and the accessibility of all other nodes stays unchanged.
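The least-fixed-point reading of accessible, and the content of the append axiom, can be illustrated with a short Python sketch. The worklist computation below is one way to obtain the smallest set containing the roots and closed under points_to; `satisfies_append_ax` is a hypothetical checker that states the axiom as a relation between the memory before and after appending, without committing to any free-list representation. Roots are assumed to be the nodes 0 to ROOTS-1.

```python
def accessible_set(m, roots):
    """Smallest set containing the roots and closed under 'points to'."""
    acc = set(range(roots))
    frontier = list(acc)
    while frontier:
        k = frontier.pop()
        for n in m[k]:              # every node that k points to
            if n not in acc:
                acc.add(n)
                frontier.append(n)
    return acc


def accessible(n, roots=1):
    return lambda m: n in accessible_set(m, roots)


def satisfies_append_ax(before, after, f, roots=1):
    """f was garbage, and afterwards exactly f has become accessible."""
    old = accessible_set(before, roots)
    return f not in old and accessible_set(after, roots) == old | {f}


# Hypothetical memories with 4 nodes, 2 sons each, and node 0 as the only root.
m = ((1, 0), (1, 1), (3, 3), (2, 2))        # nodes 2 and 3 are garbage
naive = ((1, 2), (1, 1), (3, 3), (2, 2))    # link node 2 from the root only
clean = ((1, 2), (1, 1), (0, 0), (2, 2))    # also reset the sons of node 2
```

The `naive` memory violates the axiom because linking node 2 drags node 3 along with it, while `clean` satisfies it. This is the kind of design decision the abstract axiom leaves open.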

A.2.2 The Mutator and the Collector

The complete PVS formalization of the top level specification presented in Sect. 5 is given below.

The state is simply the memory, and so is the observable state. Hence, there are no hidden variables, and the projection function proj is the identity. The next-state relation next is defined as a disjunction between three disjuncts, each representing a possible single transition of the total system. The first two disjuncts represent a move of the mutator and the collector, respectively, each move defined through a function. The third possibility just represents stuttering: the fact that a process does not change the state (needed for technical reasons).

Since each process (mutator, collector) only has one location we do not model these locations explicitly. The function Rule_mutate represents a move by the mutator, which is non-deterministic in the choice of the nodes n,k and index i. The function, when applied to an old state, yields a new state, where (if k is accessible) a pointer has been changed. Non-deterministic choices are modeled via existential quantifications. Each transition function is defined in terms of an IF-THEN-ELSE expression, where the condition represents the guard of the transition (the situation where the transition may meaningfully be applied), and where the ELSE part returns the unchanged state, in case the guard is false (see Note 4). The function Rule_append represents a move by the collector. In each step, either the mutator makes a move, or the collector does. This corresponds to an interleaving semantics of concurrency. Note how the repeated execution is guaranteed by our interpretation of what is a trace in terms of the next-state relation.
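The guarded-rule pattern and the three-way disjunction can be sketched in Python as follows. The existential quantifiers become `any(...)` over the finite ranges of nodes and indexes. The bounds, the tuple model of memory, and the parameter `append_to_free` (which stands in for the abstractly specified append operation) are assumptions for illustration, and the guard of the collector rule is assumed to be that the node is garbage.

```python
from itertools import product

NODES, SONS, ROOTS = 3, 1, 1   # hypothetical small bounds


def accessible(k, m):
    """True if node k is reachable from a root of memory m."""
    acc, frontier = set(range(ROOTS)), list(range(ROOTS))
    while frontier:
        for n in m[frontier.pop()]:
            if n not in acc:
                acc.add(n)
                frontier.append(n)
    return k in acc


def set_son(n, i, k, m):
    row = m[n][:i] + (k,) + m[n][i + 1:]
    return m[:n] + (row,) + m[n + 1:]


def rule_mutate(n, i, k):
    # Guard: the new target k must be accessible; otherwise the state is unchanged.
    return lambda m: set_son(n, i, k, m) if accessible(k, m) else m


def rule_append(n, append_to_free):
    # Assumed guard: only garbage is appended; append_to_free itself stays abstract.
    return lambda m: append_to_free(n, m) if not accessible(n, m) else m


def next_rel(append_to_free):
    def holds(m1, m2):
        mutator = any(m2 == rule_mutate(n, i, k)(m1)
                      for n, i, k in product(range(NODES), range(SONS), range(NODES)))
        collector = any(m2 == rule_append(n, append_to_free)(m1)
                        for n in range(NODES))
        return mutator or collector or m2 == m1   # third disjunct: stuttering
    return holds
```

In the memory `((0,), (2,), (1,))` only node 0 is accessible, so `rule_mutate(0, 0, 2)` leaves the memory unchanged: its guard fails because node 2 is garbage.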

A.3 The First Refinement

In this section we outline how the first refinement from Sect. 6.1 of the garbage collector is modeled in PVS. In order to keep the presentation reasonably sized, we only illustrate this first refinement. The remaining refinements follow the same pattern. First, we describe a collection of colouring functions. The theory Coloured_Memory below introduces the primitives needed for colouring memory nodes. The type Colour represents the colours black (true) and white (false). The type Colours contains possible colourings of the memory, each being a mapping from nodes to their colours. The functions colour, set_colour and blackened are formalizations of those presented in Fig. 7.
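A possible Python rendering of these colouring primitives is sketched below. It assumes that a colouring is a tuple of booleans indexed by node, and that blackened means "every accessible node is black"; the exact signatures in Fig. 7 are not reproduced here, so the argument order and the helper `accessible_set` are hypothetical.

```python
BLACK, WHITE = True, False


def colour(n):
    """colour(n)(c) is the colour of node n under colouring c."""
    return lambda c: c[n]


def set_colour(n, b):
    """set_colour(n, b)(c) is c with node n recoloured to b."""
    return lambda c: c[:n] + (b,) + c[n + 1:]


def accessible_set(m, roots):
    """Nodes reachable from the roots 0..roots-1 of memory m."""
    acc, frontier = set(range(roots)), list(range(roots))
    while frontier:
        for n in m[frontier.pop()]:
            if n not in acc:
                acc.add(n)
                frontier.append(n)
    return acc


def blackened(c, m, roots=1):
    """Assumed reading: every accessible node of m is black under c."""
    return all(colour(n)(c) == BLACK for n in accessible_set(m, roots))
```

Under this reading a garbage node may stay white without affecting blackened, since only accessible nodes are inspected.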

We now show how the first refinement is formulated in PVS. The entire theory called Garbage_Collector1 is presented below.

First of all, the state type is a record type with a field for each program variable. In addition to the ordinary program variables, there is a program counter “variable” for each process: MU for the mutator, and CHI for the collector. Each program counter ranges over a type that contains the possible labels. The observed state is still just the memory, hence ignoring, for example, the colouring C. We see that the mutator next-state relation MUTATOR is now defined as a disjunction between a mutate transition and a colour transition. The collector next-state relation COLLECTOR is defined as the disjunction between six possible transitions.

B The Proof in PVS

The proof of a single refinement lemma (step) is divided into three activities: discovery and proof of function lemmas; discovery and proof of invariant lemmas; and proof of the refinement lemma. A function lemma states a property of one or more auxiliary functions involved, which in our case are for example properties about the functions accessible and blackened. An invariant is a predicate on states, and an invariant lemma states that an invariant holds in every reachable state of the concrete implementation (Garbage_Collector1 in our case). Recall that we needed such an invariant when applying the Refinement theory (page 21). The function lemmas are used in proofs of invariant lemmas, which again are used in proofs of refinement lemmas.

We shall show these lemmas for the first refinement, using a bottom-up presentation for pedagogical reasons, starting with function lemmas, and ending with the refinement lemma. In reality, however, the proof was “discovered” top down: the refinement lemma was stated (by applying the Refinement theory to proper arguments), and during the proof of the corresponding ASSUMPTIONs, the need for invariant lemmas was discovered, and during their proofs, function lemmas were discovered.

B.1 Function Lemmas

During the proof, we need a new set of auxiliary functions to “observe” (or calculate) certain values based on the current state of the memory. These observer functions occur in invariants. In the first refinement step, we shall need the function blackened defined in the theory Memory_Observers below.

This function is similar to the function which is part of the first refinement, page 25, except that it has an additional natural number argument. The function returns true if all nodes above (and including) that argument are black if accessible. The theory contains other functions, but these are first needed in later refinements and will not be discussed here. The lemmas about auxiliary functions that we need for the first refinement are given in the theory Memory_Properties below.
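The indexed observer can be pictured with the following Python sketch, which takes the set of accessible nodes as an argument rather than recomputing it from the memory. The name `blackened_from` and its argument order are hypothetical.

```python
def blackened_from(n, colours, accessible_nodes):
    """True if every accessible node with index >= n is black (True)."""
    return all(colours[k]
               for k in range(n, len(colours))
               if k in accessible_nodes)


# Hypothetical colouring of four nodes; nodes 0, 1 and 3 are accessible.
colours = (False, True, False, True)
acc = {0, 1, 3}
```

With these values, `blackened_from(1, colours, acc)` holds because the only white node at or above index 1 is garbage, while `blackened_from(0, colours, acc)` fails on node 0. At index 0 the observer coincides with the unindexed reading "all accessible nodes are black", and at an index beyond the last node it holds vacuously.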

The theory in its entirety contains other lemmas, needed for later refinements, which we shall however not present here. The lemma accessible1 is a key lemma, and it says that the set_son operator cannot turn garbage nodes into accessible nodes.
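A property of this shape can be sanity-checked by exhaustive enumeration on a tiny memory. The sketch below assumes that the lemma carries the mutator's guard, namely that the new target k is accessible before the update; this premise is an assumption about the PVS statement, which is not reproduced here. The bounds and all function names are hypothetical.

```python
from itertools import product

NODES, SONS, ROOTS = 3, 2, 1   # hypothetical small bounds: 3**6 = 729 memories


def accessible_set(m):
    acc, frontier = set(range(ROOTS)), list(range(ROOTS))
    while frontier:
        for n in m[frontier.pop()]:
            if n not in acc:
                acc.add(n)
                frontier.append(n)
    return acc


def set_son(n, i, k, m):
    row = m[n][:i] + (k,) + m[n][i + 1:]
    return m[:n] + (row,) + m[n + 1:]


def counterexamples(require_accessible_k):
    """Updates after which some node is accessible that was garbage before."""
    found = []
    for flat in product(range(NODES), repeat=NODES * SONS):
        m = tuple(tuple(flat[r * SONS:(r + 1) * SONS]) for r in range(NODES))
        before = accessible_set(m)
        for n, i, k in product(range(NODES), range(SONS), range(NODES)):
            if require_accessible_k and k not in before:
                continue
            if not accessible_set(set_son(n, i, k, m)) <= before:
                found.append((m, n, i, k))
    return found
```

With the guard in place the search finds no counterexample at these bounds; dropping the guard produces some, for instance when a root cell is redirected to a garbage node. An exhaustive check at small bounds is of course no substitute for the PVS proof, which covers all memory sizes.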

B.2 Invariant Lemmas

We can now state the invariant needed for the first refinement step. This is given in the theory Garbage_Collector1_Inv. The invariant really needed for the refinement proof is inv1, corresponding to the invariant on page 13; but during the proof of that, invariant inv2 is needed.

Invariant inv1 is in fact the safety property originally formulated for the garbage collector in [19]. Its proof requires a generalization, which is inv2. This shows an example, where we have to strengthen an invariant (inv1) to a stronger invariant (inv2), which is then proven instead.

B.3 The Refinement Lemma

The first refinement step is formulated as an application of the Refinement theory which we defined on page 21. This is done in the theory Refinement1 shown below.

The theory imports the specification garbage collector Garbage_Collector, giving it the name S; the implementation Garbage_Collector1, named I1; and the implementation invariant I defined in the theory Garbage_Collector1_Inv. The theory further defines the abstraction function abs, and finally applies the Refinement theory. This application gives rise to four TCCs (Type Checking Conditions) generated by PVS, which have to be proven in order for the PVS specification to be well formed (type check). Furthermore, the proof of these TCCs yields the correctness of the refinement. The TCCs are shown below:

There is a TCC for each ASSUMPTION of the Refinement theory. In particular R1_TCC3 states the simulation property, and R1_TCC4 states the invariant property. As illustrated in Subsect. 6.1.2 p. X, we show for each concrete transition which abstract transition it simulates. For example, we had that APPEND.1 \(\ll\) COLLECT.1, which in this PVS setting is formulated as the following lemma.

The technique illustrated above for the first refinement step is repeated for the next two, yielding two further theories Refinement2 and Refinement3. All three refinements can now be composed, and the bottom level implementation can be shown to refine the top level specification using transitivity of the refinement relation. This is expressed in the theory Composed_Refinement below, where the theorem ref is our main correctness criterion.

About this chapter

Cite this chapter

Havelund, K., Shankar, N. (2019). A Refinement Proof for a Garbage Collector. In: Bartocci, E., Cleaveland, R., Grosu, R., Sokolsky, O. (eds) From Reactive Systems to Cyber-Physical Systems. Lecture Notes in Computer Science, vol 11500. Springer, Cham. https://doi.org/10.1007/978-3-030-31514-6_6

DOI: https://doi.org/10.1007/978-3-030-31514-6_6
Published: 23 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31513-9
Online ISBN: 978-3-030-31514-6
eBook Packages: Computer Science, Computer Science (R0)
