Improving Stateful Premise Selection with Transformers

Proroković, Krsto; Wand, Michael; Schmidhuber, Jürgen

doi:10.1007/978-3-030-81097-9_6

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12833)

Included in the following conference series:

International Conference on Intelligent Computer Mathematics

Abstract

Premise selection is a fundamental task for automated reasoning in large theories. A recently proposed approach, called stateful premise selection, formulates premise selection as a sequence-to-sequence problem. Given a theorem statement, the goal of a stateful premise selection method is to predict the set of premises that would be useful in proving it. In this work we use the Transformer architecture to learn a stateful premise selection method. We outperform the existing recurrent neural network baseline and improve upon the state of the art on a recently proposed dataset.
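
To make the sequence-to-sequence framing concrete, the following minimal sketch shows how a theorem statement and its premises might be laid out as a source/target token pair, and how a predicted premise sequence could be collapsed back into a set and scored. The toy statement, the premise names, the helpers `make_example`, `decode_premises` and `jaccard`, and the choice of Jaccard overlap as a score are all illustrative assumptions; none of them is taken from the paper or its repository.

```python
from typing import List, Set, Tuple

SPECIAL_TOKENS = {"<bos>", "<eos>", "<pad>"}  # hypothetical special tokens


def make_example(statement: str, premises: List[str]) -> Tuple[List[str], List[str]]:
    """Turn a (statement, premises) pair into source and target token sequences."""
    source = statement.split()                  # whitespace tokenisation (assumption)
    target = ["<bos>"] + premises + ["<eos>"]   # one target token per premise name
    return source, target


def decode_premises(target_tokens: List[str]) -> Set[str]:
    """Collapse a target sequence into a set of premise names.

    Duplicates and special tokens are dropped, since the task is to
    predict a set of premises rather than an ordered list.
    """
    return {tok for tok in target_tokens if tok not in SPECIAL_TOKENS}


def jaccard(predicted: Set[str], useful: Set[str]) -> float:
    """Overlap between predicted and useful premises (assumed metric)."""
    if not predicted and not useful:
        return 1.0
    return len(predicted & useful) / len(predicted | useful)


# Hypothetical toy data, not drawn from any real dataset.
source, target = make_example(
    "! [ A ] : subset ( A , A )",
    ["t1_xboole", "d3_tarski"],
)

# Pretend a trained model emitted this sequence for the statement above.
model_output = ["<bos>", "d3_tarski", "t2_foo", "d3_tarski", "<eos>"]
predicted = decode_premises(model_output)
score = jaccard(predicted, decode_premises(target))
```

In this layout any encoder-decoder model, recurrent or Transformer-based, can be trained on the `(source, target)` pairs; only the final decoding step treats the output as a set.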

This work was supported by the ERC Advanced grant no. 742870. We would like to thank Kazuki Irie for constructive feedback on the manuscript as well as Róbert Csordás and Dieuwke Hupkes for useful advice about the Transformer architecture.

Notes

1. The code for reproducing the results displayed here is available at https://github.com/krstopro/stateful-premise-selection-with-transformers.

Author information

Authors and Affiliations

Instituto Dalle Molle di Studi sull’Intelligenza Artificiale (IDSIA), USI & SUPSI, Lugano, Switzerland
Krsto Proroković, Michael Wand & Jürgen Schmidhuber

Corresponding author

Correspondence to Krsto Proroković.

Editor information

Editors and Affiliations

Heriot-Watt University, Edinburgh, UK
Fairouz Kamareddine
University of Bologna, Bologna, Italy
Claudio Sacerdoti Coen

About this paper

Cite this paper

Proroković, K., Wand, M., Schmidhuber, J. (2021). Improving Stateful Premise Selection with Transformers. In: Kamareddine, F., Sacerdoti Coen, C. (eds) Intelligent Computer Mathematics. CICM 2021. Lecture Notes in Computer Science, vol 12833. Springer, Cham. https://doi.org/10.1007/978-3-030-81097-9_6

DOI: https://doi.org/10.1007/978-3-030-81097-9_6
Published: 20 July 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-81096-2
Online ISBN: 978-3-030-81097-9
eBook Packages: Computer Science (R0)