Abstract
Interactive Theorem Provers (ITPs) are an indispensable tool in the arsenal of formal method experts as a platform for construction and (formal) verification of proofs. The complexity of the proofs in conjunction with the level of expertise typically required for the process to succeed can often hinder the adoption of ITPs. A recent strain of work has investigated methods to incorporate machine learning models trained on ITP user activity traces as a viable path towards full automation. While a valuable line of investigation, many problems still require human supervision to be completed fully, thus applying learning methods to assist the user with useful recommendations can prove more fruitful. Following the vein of user assistance, we introduce CoProver, a proof recommender system based on transformers, capable of learning from past actions during proof construction, all while exploring knowledge stored in the ITP concerning previous proofs. CoProver employs a neurally learnt sequence-based encoding of sequents, capturing long distance relationships between terms and hidden cues therein. We couple CoProver with the Prototype Verification System (PVS) and evaluate its performance on two key areas, namely: (1) Next Proof Action Recommendation, and (2) Relevant Lemma Retrieval given a library of theories. We evaluate CoProver on a series of well-established metrics originating from the recommender system and information retrieval communities, respectively. We show that CoProver successfully outperforms prior state of the art applied to recommendation in the domain. We conclude by discussing future directions viable for CoProver (and similar approaches) such as argument prediction, proof summarization, and more.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
- 2.
A step is a single forward-predict pass over a training instance, and multiple steps can be performed over the same data during the training phase.
- 3.
Initial experiments with larger samples showed no difference in performance with a system trained with the smaller set.
- 4.
- 5.
References
Alama, J., Kühlwein, D., Tsivtsivadze, E., Urban, J., Heskes, T.: Premise selection for mathematics by corpus analysis and kernel methods. CoRR abs/1108.3446 (2011). arXiv:1108.3446
Balunovic, M., Bielik, P., Vechev, M.T.: Learning to solve SMT formulas. In: NeurIPS, pp. 10338–10349 (2018)
Bansal, K., Loos, S., Rabe, M., Szegedy, C., Wilcox, S.: Holist: An environment for machine learning of higher order logic theorem proving. In: International Conference on Machine Learning, pp. 454–463. PMLR (2019)
Biere, A., Heule, M., van Maaren, H., Walsh, T. (eds.): Handbook of Satisfiability. IOS Press (2009)
Blanchette, J.C., Greenaway, D., Kaliszyk, C., Kühlwein, D., Urban, J.: A learning-based fact selector for Isabelle/HOL. J. Autom. Reason. 57(3), 219–244 (2016)
Bromley, J., Guyon, I., LeCun, Y., Säckinger, E., Shah, R.: Signature verification using a "siamese" time delay neural network. In: Advances in Neural Information Processing Systems, vol. 6 (1993)
Brown, T.B., et al.: Language models are few-shot learners. CoRR abs/2005.14165 (2020). arXiv:2005.14165
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171–4186. Association for Computational Linguistics, Minneapolis, Minnesota (Jun 2019). 10.18653/v1/N19-1423, http://aclanthology.org/N19-1423
Efron, B., Tibshirani, R.J.: An Introduction to the Bootstrap. No. 57 in Monographs on Statistics and Applied Probability, Chapman & Hall/CRC, Boca Raton, Florida, USA (1993)
First, E., Brun, Y., Guha, A.: Tactok: semantics-aware proof synthesis. In: Proceedings of the ACM on Programming Languages 4(OOPSLA), pp. 1–31 (2020)
Gage, P.: A new algorithm for data compression. C Users J. D(2), 23–38 (1994)
Gauthier, T., Kaliszyk, C., Urban, J.: Learning to reason with hol4 tactics. arXiv preprint arXiv:1804.00595 (2018)
Gauthier, T., Kaliszyk, C., Urban, J., Kumar, R., Norrish, M.: Learning to prove with tactics. CoRR abs/1804.00596 (2018), arXiv:1804.00596
Gauthier, T., Kaliszyk, C., Urban, J., Kumar, R., Norrish, M.: Tactictoe: learning to prove with tactics. J. Autom. Reason. 65(2), 257–286 (2021)
Gransden, T., Walkinshaw, N., Raman, R.: SEPIA: search for proofs using inferred automata. In: Felty, A.P., Middeldorp, A. (eds.) CADE 2015. LNCS (LNAI), vol. 9195, pp. 246–255. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21401-6_16
Hsu, C.W., Chang, C.C., Lin, C.J.: A practical guide to support vector classification. Tech. rep., Department of Computer Science, National Taiwan University (2003). http://www.csie.ntu.edu.tw/~cjlin/papers.html
Huang, D., Dhariwal, P., Song, D., Sutskever, I.: Gamepad: A learning environment for theorem proving. arXiv preprint arXiv:1806.00608 (2018)
Huang, P.S., He, X., Gao, J., Deng, L., Acero, A., Heck, L.: Learning deep structured semantic models for web search using clickthrough data. In: Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, pp. 2333–2338. CIKM ’13, Association for Computing Machinery, New York, NY, USA (2013). https://doi.org/10.1145/2505515.2505665
Irving, G., Szegedy, C., Alemi, A.A., Eén, N., Chollet, F., Urban, J.: Deepmath-deep sequence models for premise selection. Adv. Neural. Inf. Process. Syst. 29, 2235–2243 (2016)
Jakubův, J., Urban, J.: ENIGMA: efficient learning-based inference guiding machine. In: Geuvers, H., England, M., Hasan, O., Rabe, F., Teschke, O. (eds.) CICM 2017. LNCS (LNAI), vol. 10383, pp. 292–302. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-62075-6_20
Jiang, A.Q., et al.: Thor: Wielding hammers to integrate language models and automated theorem provers (2022). https://doi.org/10.48550/ARXIV.2205.10893, arXiv:2205.10893
Jiang, A.Q., et al.: Draft, sketch, and prove: Guiding formal theorem provers with informal proofs (2022). https://doi.org/10.48550/ARXIV.2210.12283, arXiv:2210.12283
Jones, K.S.: A statistical interpretation of term specificity and its application in retrieval. J. Document. 28, 11–21 (1972)
Kaliszyk, C., Urban, J.: Femalecop: Fairly efficient machine learning connection prover. In: Logic for Programming, Artificial Intelligence, and Reasoning. pp. 88–96. Springer (2015)
Komendantskaya, E., Heras, J., Grov, G.: Machine learning in proof general: Interfacing interfaces. arXiv preprint arXiv:1212.3618 (2012)
Kühlwein, D., Blanchette, J.C., Kaliszyk, C., Urban, J.: MaSh: machine learning for sledgehammer. In: Blazy, S., Paulin-Mohring, C., Pichardie, D. (eds.) ITP 2013. LNCS, vol. 7998, pp. 35–50. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39634-2_6
Lample, Get al.: Hypertree proof search for neural theorem proving (2022). https://doi.org/10.48550/ARXIV.2205.11491, arXiv:2205.11491
Li, S., et al.: PyTorch Distributed: Experiences on accelerating data parallel training. CoRR abs/2006.15704 (2020), arXiv:2006.15704
Li, W., Yu, L., Wu, Y., Paulson, L.C.: Isarstep: a benchmark for high-level mathematical reasoning. arXiv preprint arXiv:2006.09265 (2020)
Liu, Y., et al.: RoBERTa: A robustly optimized BERT pretraining approach (2019). https://doi.org/10.48550/ARXIV.1907.11692, arXiv:1907.11692
Loos, S., Irving, G., Szegedy, C., Kaliszyk, C.: Deep network guided proof search. arXiv preprint arXiv:1701.06972 (2017)
Luhn, H.P.: A statistical approach to mechanized encoding and searching of literary information. IBM J. Res. Dev. 1(4), 309–317 (1957). https://doi.org/10.1147/rd.14.0309
Manning, C.D., Raghavan, P., Schatze, H.: Introduction to Information Retrieval. Cambridge University Press, Cambridge, UK (2008). http://nlp.stanford.edu/IR-book/information-retrieval-book.html
Mikuła, M., et al.: Magnushammer: A transformer-based approach to premise selection (2023)
Mitra, B., Craswell, N.: (2018)
Otten, J., Bibel, W.: leancop: lean connection-based theorem proving. J. Symb. Comput. 36(1–2), 139–161 (2003)
Owre, S., Rushby, J.M., Shankar, N.: PVS: a prototype verification system. In: Kapur, D. (ed.) CADE 1992. LNCS, vol. 607, pp. 748–752. Springer, Heidelberg (1992). https://doi.org/10.1007/3-540-55602-8_217
Rabe, M.N., Szegedy, C.: Towards the automatic mathematician. In: Platzer, A., Sutcliffe, G. (eds.) CADE 2021. LNCS (LNAI), vol. 12699, pp. 25–37. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-79876-5_2
Raffel, C., et al.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019), arXiv:1910.10683
Reimers, N., Gurevych, I.: Sentence-BERT: Sentence embeddings using Siamese BERT-Networks. CoRR abs/1908.10084 (2019). arXiv:1908.10084
Schulz, S.: E - a brainiac theorem prover. J. AI Commun. 15(2/3), 111–126 (2002)
Selsam, D., Bjørner, N.: Guiding high-performance SAT solvers with Unsat-core predictions. In: Janota, M., Lynce, I. (eds.) SAT 2019. LNCS, vol. 11628, pp. 336–353. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-24258-9_24
Sennrich, R., Haddow, B., Birch, A.: Neural machine translation of rare words with subword units. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1715–1725. Association for Computational Linguistics, Berlin, Germany (Aug 2016). https://doi.org/10.18653/v1/P16-1162, http://aclanthology.org/P16-1162
Shankar, N.: Automated reasoning, fast and slow. In: Proceedings of the 24th international conference on Automated Deduction, pp. 145–161. CADE’13, Springer-Verlag, Berlin, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38574-2_10, http://dx.doi.org/10.1007/978-3-642-38574-2_10
Urban, J., Vyskočil, J., Štěpánek, P.: MaLeCoP machine learning connection prover. In: Brünnler, K., Metcalfe, G. (eds.) TABLEAUX 2011. LNCS (LNAI), vol. 6793, pp. 263–277. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22119-4_21
Vaswani, A.,et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
Wang, M., Tang, Y., Wang, J., Deng, J.: Premise selection for theorem proving by deep graph embedding. arXiv preprint arXiv:1709.09994 (2017)
Whalen, D.: Holophrasm: a neural automated theorem prover for higher-order logic. arXiv preprint arXiv:1608.02644 (2016)
Wolf, T., et al.: Transformers: State-of-the-art natural language processing. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 38–45. Association for Computational Linguistics, Online (Oct 2020), www.aclweb.org/anthology/2020.emnlp-demos.6
Wu, M., Norrish, M., Walder, C., Dezfouli, A.: Tacticzero: Learning to prove theorems from scratch with deep reinforcement learning. arXiv preprint arXiv:2102.09756 (2021)
Yang, K., Deng, J.: Learning to prove theorems via interacting with proof assistants. In: International Conference on Machine Learning, pp. 6984–6994. PMLR (2019)
Acknowledgement
This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR00112290064 and by the National Institute of Aeronautics. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the United States Government or DARPA.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Yeh, E., Hitaj, B., Owre, S., Quemener, M., Shankar, N. (2023). CoProver: A Recommender System for Proof Construction. In: Dubois, C., Kerber, M. (eds) Intelligent Computer Mathematics. CICM 2023. Lecture Notes in Computer Science(), vol 14101. Springer, Cham. https://doi.org/10.1007/978-3-031-42753-4_16
Download citation
DOI: https://doi.org/10.1007/978-3-031-42753-4_16
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-42752-7
Online ISBN: 978-3-031-42753-4
eBook Packages: Computer ScienceComputer Science (R0)