What Can Large Language Models Do for Theorem Proving and Formal Methods?

Johansson, Moa

doi:10.1007/978-3-031-46002-9_25

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14380)

Included in the following conference series:

International Conference on Bridging the Gap between AI and Reality

Abstract

With the introduction of large language models, AI for natural language has taken a leap. These systems are now also being used for tasks that have previously been dominated by symbolic methods, such as program synthesis, and even to support formalising mathematics and assist theorem provers. We survey some recent applications in theorem proving, focusing on how they combine neural networks with symbolic systems, and report on a case study of using GPT-4 for the task of automated conjecturing, also known as theory exploration.

Author information

Authors and Affiliations

Chalmers University of Technology, Gothenburg, Sweden
Moa Johansson

Corresponding author

Correspondence to Moa Johansson.

Editor information

Editors and Affiliations

TU Dortmund University, Dortmund, Germany
Bernhard Steffen

Cite this paper

Johansson, M. (2024). What Can Large Language Models Do for Theorem Proving and Formal Methods?. In: Steffen, B. (eds) Bridging the Gap Between AI and Reality. AISoLA 2023. Lecture Notes in Computer Science, vol 14380. Springer, Cham. https://doi.org/10.1007/978-3-031-46002-9_25

DOI: https://doi.org/10.1007/978-3-031-46002-9_25
Published: 14 December 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46001-2
Online ISBN: 978-3-031-46002-9
eBook Packages: Computer Science, Computer Science (R0)
