Abstract
With the introduction of large language models, AI for natural language processing has taken a leap. These systems are now also being used for tasks that have previously been dominated by symbolic methods, such as program synthesis, formalising mathematics, and assisting theorem provers. We survey some recent applications of large language models in theorem proving, focusing on how they combine neural networks with symbolic systems, and report on a case study of using GPT-4 for automated conjecturing, also known as theory exploration.
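To make the case study concrete, below is a minimal Python sketch of neuro-symbolic conjecturing, not the paper's actual setup: an LLM is prompted to propose candidate lemmas about list functions, and a cheap symbolic check (random counterexample testing, in the spirit of property-based theory exploration tools) filters out falsified conjectures. The prompt, the model name "gpt-4", the helper functions rev and app, and the expected output format are all illustrative assumptions; the sketch presumes the openai Python package and an OPENAI_API_KEY in the environment.

import random
from openai import OpenAI

PROMPT = """You are exploring a theory of lists with the functions
rev (reverse) and app (append). Propose five candidate lemmas as
Python expressions over lists xs and ys, one per line, in the form
lhs == rhs, for example:
rev(app(xs, ys)) == app(rev(ys), rev(xs))"""

def rev(xs): return list(reversed(xs))
def app(xs, ys): return xs + ys

def plausible(conjecture: str, trials: int = 100) -> bool:
    """Counterexample-test a conjecture on random small lists."""
    for _ in range(trials):
        env = {"rev": rev, "app": app,
               "xs": [random.randint(0, 3) for _ in range(random.randint(0, 4))],
               "ys": [random.randint(0, 3) for _ in range(random.randint(0, 4))]}
        try:
            if not eval(conjecture, {"__builtins__": {}}, env):
                return False  # found a counterexample
        except Exception:
            return False  # ill-formed conjecture
    return True

client = OpenAI()  # reads OPENAI_API_KEY from the environment
reply = client.chat.completions.create(
    model="gpt-4", messages=[{"role": "user", "content": PROMPT}])
for line in reply.choices[0].message.content.splitlines():
    line = line.strip().lstrip("0123456789.- ")  # drop any list numbering
    if "==" in line:
        status = "plausible" if plausible(line) else "falsified"
        print(f"{status}: {line}")

In a full theory exploration pipeline, the surviving conjectures would then be passed on to an automated or interactive theorem prover for proof; testing only rules out false candidates, it does not establish the true ones.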