Abstract
This paper presents what we believe is the most comprehensive evidence of a theorem prover’s soundness to date. Our subject is the Milawa theorem prover, and we present evidence of its soundness down to the machine code. Milawa is a theorem prover styled after NQTHM and ACL2. It is based on an idealised version of ACL2’s computational logic and provides the user with high-level tactics similar to ACL2’s. In contrast to NQTHM and ACL2, Milawa has a small kernel that is somewhat like an LCF-style system. We explain how the Milawa theorem prover is constructed as a sequence of reflective extensions from its kernel. The kernel establishes the soundness of these extensions during Milawa’s bootstrapping process. Going deeper, we explain how we have used the HOL4 theorem prover to show that the Milawa kernel is sound. In HOL4, we have formalised Milawa’s logic, proved the logic sound, and proved that the source code for the Milawa kernel (1,700 lines of Lisp) faithfully implements this logic. Going even further, we have combined these results with the x86 machine-code level verification of the Lisp runtime Jitawa. Our top-level theorem states that Milawa can never claim to prove anything that is false when it is run on this Lisp runtime.
Additional information
Dedicated to John McCarthy (1927–2011)
The second author was partially supported by the Royal Society UK and the Swedish Research Council.
About this article
Cite this article
Davis, J., Myreen, M.O. The Reflective Milawa Theorem Prover is Sound (Down to the Machine Code that Runs it). J Autom Reasoning 55, 117–183 (2015). https://doi.org/10.1007/s10817-015-9324-6
DOI: https://doi.org/10.1007/s10817-015-9324-6