
Axiom selection over large theory based on new first-order formula metrics

Applied Intelligence

Abstract

Axiom selection is the task of selecting, from a large-scale axiom set, the axioms most likely to be useful for proving a given conjecture. Existing axiom selection methods either take only shallow symbols into account or depend strongly on previous successful proofs of homologous problems. To address these problems, we introduce a new metric to evaluate the dissimilarity between formulae and use it as an evaluator in the selection task. First, we propose a substitution-based metric to compute the dissimilarity between terms. It is a pseudo-metric and captures the in-depth syntactic differences triggered by both functional and variable subterms. We then extend it to atoms and prove that the atom metric is also a pseudo-metric. Treating formulae as atom sets, we define three kinds of dissimilarity metrics between formulae. Finally, we design and implement conjecture-oriented axiom selection methods based on the newly proposed formula metrics. The experimental evaluation is conducted on the MPTP2078 benchmark and demonstrates that dissimilarity-based axiom selection improves the E prover’s performance. In the best case, it increases the ratio of successful proofs from 30.90% to 42.25%.

Notes

  1. The values of 1 and 2 for wv and wf are adopted from their use in E. The use of ln(x + 1) for g(x) is motivated by its common adoption as a continuous increasing function.

  2. https://github.com/JUrban/MPTP2078

  3. The command is ./eprover --satauto-schedule --free-numbers -s -R --delete-bad-limit=2000000000 --definitional-cnf=24 --print-statistics --print-version --proof-object --cpu-limit=60 problem_file

  4. The command is ./vampire --mode axiom_selection --output_axiom_names on problem_file

  5. The command is ./eprover --satauto-schedule --free-numbers -s -R --delete-bad-limit=2000000000 --definitional-cnf=24 --print-statistics --print-version --proof-object --cpu-limit=60 --sine problem_file

  6. The command is ./eprover --satauto-schedule --free-numbers -s -R --delete-bad-limit=2000000000 --definitional-cnf=24 --print-statistics --print-version --proof-object --cpu-limit=10 problem_file
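
Note 1 fixes g(x) = ln(x + 1). The proof of Theorem 1 in Appendix A uses g(0) = 0 and an inequality of the form g(a + b) ≤ g(a) + g(b) for non-negative arguments. The Python sketch below spot-checks both properties on a small grid. The helper names and the grid are illustrative choices, not taken from the paper, and a numerical check is not a substitute for the proof.

```python
import math


def g(x: float) -> float:
    """Damping function from Note 1: g(x) = ln(x + 1), for x >= 0."""
    return math.log(x + 1.0)


def is_subadditive_on(points, tol=1e-12):
    """Check g(a + b) <= g(a) + g(b) for every pair of sample points."""
    return all(g(a + b) <= g(a) + g(b) + tol for a in points for b in points)


def is_increasing_on(points):
    """Check that g is strictly increasing along the sorted sample points."""
    ordered = sorted(points)
    return all(g(a) < g(b) for a, b in zip(ordered, ordered[1:]))


# Illustrative non-negative sample values (an assumption of this sketch).
grid = [0, 1, 2, 3, 5, 8, 13, 100]
```

The subadditivity holds in general because ln(a + b + 1) ≤ ln((a + 1)(b + 1)) whenever a, b ≥ 0.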

Funding

This research was funded by the National Natural Science Foundation of China (grant numbers 61603307, 61673320, and 61473239) and by the Ministry of Education in China Project of Humanities and Social Sciences (grant number 9YJCZH048).

Author information

Corresponding author

Correspondence to Qinghua Liu.

Ethics declarations

Competing interests

The authors declare no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

Proof of Lemma 1

We need to prove that d′ satisfies all the properties of a pseudo-metric whenever the function S′ satisfies conditions 1 to 6.

Let t1 and t2 be two arbitrary terms. By condition 1,

$$d'(t_{1},t_{2})\geq_{2}(0,0).$$

If t2 = t1, then t1 is its own lgg, and the substitution mapping a term to itself is either the empty substitution or a renaming substitution. By conditions 2 and 3,

$$d'(t_{1},t_{1})=_{2}(0,0).$$

By the definition of d′, symmetry is also satisfied. That is,

$$d'(t_{1},t_{2})=_{2}d'(t_{2},t_{1}).$$

Finally, we prove that d′ satisfies the triangle inequality.

As shown in Fig. 2, let t1, t2, and t3 be three arbitrary terms with lgg(t1,t2) = t4, lgg(t1,t3) = t5, lgg(t2,t3) = t6, and lgg(t5,t6) = t7. Here, t3 must be a unified term of t5 and t6. For convenience, we use 𝜃ij to denote the substitution that maps term ti to term tj.

Fig. 2 Proof structure of the triangle inequality

By the definition of d′,

$$d'(t_{1},t_{2})=_{2}S'(\theta_{41})+S'(\theta_{42}),$$
$$d'(t_{1},t_{3})=_{2}S'(\theta_{51})+S'(\theta_{53}),$$
$$d'(t_{2},t_{3})=_{2}S'(\theta_{62})+S'(\theta_{63}).$$

By condition 5,

$$S'(\theta_{41})\leq_{2} S'(\theta_{71}), \quad S'(\theta_{42})\leq_{2} S'(\theta_{72}).$$

By condition 4,

$$S'(\theta_{71})\leq_{2} S'(\theta_{75})+S'(\theta_{51}),\quad S'(\theta_{72})\leq_{2} S'(\theta_{76})+S'(\theta_{62}).$$

By condition 6,

$$S'(\theta_{75})+S'(\theta_{76})\leq_{2} S'(\theta_{53})+S'(\theta_{63}).$$

Hence,

$$ \begin{aligned} d'(t_{1},t_{2})&=_{2}S'(\theta_{41})+S'(\theta_{42})\leq_{2} S'(\theta_{71})+S'(\theta_{72}) \\ &\leq_{2} S'(\theta_{75})+S'(\theta_{51})+S'(\theta_{76})+S'(\theta_{62}) \\ &\leq_{2}S'(\theta_{51})+S'(\theta_{53})+S'(\theta_{62})+S'(\theta_{63}) \\ &=_{2}d'(t_{1},t_{3})+d'(t_{2},t_{3}). \end{aligned} $$
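
The argument above relies on least general generalizations (lgg) and on the substitutions 𝜃ij that map an lgg back to its instances. As a minimal sketch of how such an lgg and its two substitutions can be computed, the following Python code implements Plotkin-style anti-unification. The term encoding (variables as strings, applications as tuples), the function names, and the fresh-variable naming scheme `G0, G1, ...` are assumptions of this sketch, not taken from the paper; input terms are assumed not to use variable names of that form.

```python
from itertools import count

# Assumed encoding: a variable is a string such as 'X'; a function
# application is a tuple (symbol, arg1, ..., argn); a constant is ('a',).


def lgg(s, t, table, fresh):
    """Least general generalization of terms s and t (anti-unification)."""
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        # Same head symbol and arity: generalize argument-wise.
        return (s[0],) + tuple(lgg(a, b, table, fresh)
                               for a, b in zip(s[1:], t[1:]))
    # Differing subterms: reuse one variable per distinct pair (s, t).
    if (s, t) not in table:
        table[(s, t)] = f"G{next(fresh)}"
    return table[(s, t)]


def generalize(s, t):
    """Return (lgg, theta1, theta2) with lgg*theta1 == s and lgg*theta2 == t."""
    table = {}
    general = lgg(s, t, table, count())
    theta1 = {var: left for (left, _), var in table.items()}
    theta2 = {var: right for (_, right), var in table.items()}
    return general, theta1, theta2


def apply_subst(term, theta):
    """Apply a substitution (dict from variable to term) to a term."""
    if isinstance(term, str):
        return theta.get(term, term)
    return (term[0],) + tuple(apply_subst(arg, theta) for arg in term[1:])
```

For example, `generalize(('f', ('a',), ('g', 'X')), ('f', ('b',), ('g', 'X')))` generalizes f(a, g(X)) and f(b, g(X)) by replacing the differing constants with a single fresh variable, and applying either returned substitution to the lgg recovers the corresponding input term.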

Proof of Theorem 1

By Lemma 1, we only need to show that the function ST satisfies conditions 1 to 6.

  1. For any substitution 𝜃, by (3) and (4), we have

    $$S_{f}(\theta_{f})\geq 0 \quad\text{and}\quad S_{v}(\theta_{v})\geq 0.$$

    Hence, by (5),

    $$S_{T}(\theta)\geq_{2}(0,0).$$
  2. For the empty substitution ε, both εf and εv are also empty substitutions. Hence, Sf(εf) and Sv(εv) are empty sums, and we have

    $$S_{T}(\varepsilon)=_{2}(0,0).$$
  3. For a renaming substitution η with t1η = t2, we have ηf = ε and ηv = η. Suppose that ηv = η = {X1 ↦ Y1, ..., Xn ↦ Yn}, where Y1, ..., Yn are distinct variables in t2. Then, ∀ Xi ↦ Yi ∈ ηv, \(occ_{t_{1}}^{+}(X_{i})=occ_{t_{2}}^{+}(Y_{i})\) and \(g(w_{v}(occ_{t_{2}}^{+}(Y_{i})-occ_{t_{1}}^{+}(X_{i})))=g(0)=0\). Hence,

    $$ S_{T}(\eta)=_{2}(0,0).$$
  4. As shown in Fig. 3, consider three arbitrary terms t1, t2, and t3 that satisfy

    $$t_{1}\theta_{1}=t_{2},\quad t_{2}\theta_{2}=t_{3},\quad \text{and}\quad t_{1}\theta=t_{3}. $$

    Fig. 3 Triangle structure

    To prove the inequality \(S_{T}(\theta)\leq_{2}S_{T}(\theta_{1})+S_{T}(\theta_{2})\), we first prove that \(S_{f}(\theta _{f})\leq S_{f}(\theta _{1_{f}})+S_{f}(\theta _{2_{f}})\).

    (a) 𝜃f = ∅. Hence, \(\theta _{1_{f}}=\boldsymbol {\emptyset }\) and \(\theta _{2_{f}}=\boldsymbol {\emptyset }\). We have

      \(S_{f}(\theta _{f})=S_{f}(\theta _{1_{f}})+S_{f}(\theta _{2_{f}})\).

    (b) 𝜃f ≠ ∅. Suppose that Ran(𝜃f) = {u1, ..., um} (m ≥ 1). In 𝜃, ∀ ui ∈ Ran(𝜃f), let \(V_{t_{1}}(u_{i})=\{X_{i1}, ..., X_{ik}\}(k\geq 1)\) denote the set of variables of t1 that are substituted by ui. \(\forall X_{ij}\in V_{t_{1}}(u_{i})\), let \(occ_{t_{1}}(X_{ij})=n_{ij}(\geq 1)\) and \({\sum }_{j=1}^{k}n_{ij}=n_{i}\). Hence, Sf(𝜃f) can also be written as

      \(S_{f}(\theta _{f})={\sum }_{f\in \bigcup _{i=1}^{m}F(u_{i})}w_{f}(f)({\sum }_{i=1}^{m}n_{i}occ_{u_{i}}(f))\).

      For any \(f\in \bigcup _{i=1}^{m}F(u_{i})\), \({\sum }_{i=1}^{m}n_{i}occ_{u_{i}}(f)\) is the number of new occurrences of f under 𝜃f. The numbers of new occurrences of f under \(\theta _{1_{f}}\) and under \(\theta _{2_{f}}\) sum to \({\sum }_{i=1}^{m}n_{i}occ_{u_{i}}(f)\). Hence,

      \(S_{f}(\theta _{f})=S_{f}(\theta _{1_{f}})+S_{f}(\theta _{2_{f}})\).

    Next, we prove \(S_{v}(\theta _{v})\leq S_{v}(\theta _{1_{v}})+S_{v}(\theta _{2_{v}})\).

    (a) 𝜃v is a renaming substitution or an empty substitution. Then \(\theta _{1_{v}}\) and \(\theta _{2_{v}}\) are each either an empty substitution or a renaming substitution (they should not both be empty at the same time). By the previous proof, we know

      \(S_{v}(\theta _{v})=S_{v}(\theta _{1_{v}})=S_{v}(\theta _{2_{v}})=0\).

      Hence,

      \(S_{v}(\theta _{v})=S_{v}(\theta _{1_{v}})+S_{v}(\theta _{2_{v}})\).

    (b) Otherwise, suppose that 𝜃v = {X1 ↦ Z1, ..., Xn ↦ Zn}. ∀ Xi ↦ Zi ∈ 𝜃v, let \(\theta _{v_{i}}=\{X_{i}\mapsto Z_{i}\}\), then \(\bigcup _{i=1}^{n}\theta _{v_{i}}=\theta _{v}\) and \(\theta _{v_{i}}\cap \theta _{v_{j}}=\boldsymbol {\emptyset }(i\neq j)\). Obviously,

      \(S_{v}(\theta _{v})={\sum }_{i=1}^{n}S_{v}(\theta _{v_{i}})\).

      For any \(\theta _{v_{i}}\), we have

      \(S_{v}(\theta _{v_{i}})=g(w_{v}(occ_{t_{3}}^{+}(Z_{i})-occ_{t_{1}}^{+}(X_{i})))\).

      There must exist substitutions \(\theta _{1_{v_{i}}}=\{X_{i}\mapsto Y_{i}\}\subseteq \theta _{1_{v}}\) (Yi may equal Xi) and \(\theta _{2_{v_{i}}}=\{Y_{i}\mapsto Z_{i}\}\subseteq \theta _{2_{v}}\) (Yi may equal Zi) such that

      \(X_{i}\theta _{1_{v_{i}}}\theta _{2_{v_{i}}}=Z_{i}\).

      If Yi = Xi or Yi = Zi, then \(\theta _{1_{v_{i}}}\) or \(\theta _{2_{v_{i}}}\) is an empty substitution. Obviously,

      \(occ_{t_{2}}^{+}(Y_{i})\geq occ_{t_{1}}^{+}(X_{i})\) and \(occ_{t_{3}}^{+}(Z_{i})\geq occ_{t_{2}}^{+}(Y_{i})\).

      By the properties of function g, we have

      \(g(w_{v}(occ_{t_{3}}^{+}(Z_{i})-occ_{t_{1}}^{+}(X_{i})))=g(w_{v}(occ_{t_{3}}^{+}(Z_{i})-occ_{t_{2}}^{+}(Y_{i})+occ_{t_{2}}^{+}(Y_{i})-occ_{t_{1}}^{+}(X_{i})))\leq g(w_{v}(occ_{t_{3}}^{+}(Z_{i})-occ_{t_{2}}^{+}(Y_{i})))+g(w_{v}(occ_{t_{2}}^{+}(Y_{i})-occ_{t_{1}}^{+}(X_{i})))\).

      Hence,

      \(S_{v}(\theta _{v_{i}})\leq S_{v}(\theta _{1_{v_{i}}})+S_{v}(\theta _{2_{v_{i}}})\) and \(S_{v}(\theta _{v})\leq S_{v}(\theta _{1_{v}})+S_{v}(\theta _{2_{v}})\).

    As a result,

    \(S_{T}(\theta)\leq_{2}S_{T}(\theta_{1})+S_{T}(\theta_{2})\).

  5. Under the same assumptions as in 4, it is clear that \(S_{f}(\theta _{2_{f}})\leq S_{f}(\theta _{f})\).

    (a) \(S_{f}(\theta _{2_{f}})<S_{f}(\theta _{f})\). It is obvious that \(S_{T}(\theta_{2})<_{2}S_{T}(\theta)\).

    (b) \(S_{f}(\theta _{2_{f}})=S_{f}(\theta _{f})\). Suppose that 𝜃v = {X1 ↦ Z1, ..., Xn ↦ Zn}. ∀ Xi ↦ Zi ∈ 𝜃v, there must exist substitutions \(\{X_{i}\mapsto Y_{i}^{\prime }\}\subseteq \theta _{1_{v}}\) and \(\{Y_{i}^{\prime }\mapsto Z_{i}\}\subseteq \theta _{2_{v}}\). Let \(\theta _{2_{v}}^{\prime }=\bigcup _{i}\{Y_{i}^{\prime }\mapsto Z_{i} | occ_{t_{3}}^{+}(Z_{i})\geq occ_{t_{2}}^{+}(Y_{i}^{\prime })\}\) and \(\theta _{2_{v}}^{\prime \prime }=\bigcup _{j}\{Y_{j}^{\prime \prime }\mapsto Z_{j}^{\prime \prime } | Y_{j}^{\prime \prime } \in Ran(\theta _{1_{f}})\wedge occ_{t_{2}}^{+}(Y_{j}^{\prime \prime })=occ_{t_{3}}^{+}(Z_{j}^{\prime \prime }) \}\). We can assert that \(\theta _{2_{v}}=\theta _{2_{v}}^{\prime }\cup \theta _{2_{v}}^{\prime \prime }\). Hence,

      \(S_{v}(\theta _{2_{v}})\leq S_{v}(\theta _{v})\).

    As a result,

    \(S_{T}(\theta_{2})\leq_{2}S_{T}(\theta)\).

  6. As shown in Fig. 4, let t1 and t2 be two unifiable terms with lgg(t1,t2) = t3, and let t4 be a unified term of t1 and t2 (not necessarily the most general one), so that t3ρ1 = t1, t3ρ2 = t2, and t1φ1 = t2φ2 = t4. To prove \(S_{T}(\rho_{1})+S_{T}(\rho_{2})\leq_{2}S_{T}(\varphi_{1})+S_{T}(\varphi_{2})\), we first prove \(S_{f}(\rho _{1_{f}})+S_{f}(\rho _{2_{f}})\leq S_{f}(\varphi _{1_{f}})+S_{f}(\varphi _{2_{f}})\).

    (a) \(\rho _{1_{f}}=\boldsymbol {\emptyset }\) and \(\rho _{2_{f}}=\boldsymbol {\emptyset }\). Obviously,

      \(S_{f}(\rho _{1_{f}})+S_{f}(\rho _{2_{f}})\leq S_{f}(\varphi _{1_{f}})+S_{f}(\varphi _{2_{f}})\).

    (b) \(\rho _{1_{f}}\neq \boldsymbol {\emptyset }\) or \(\rho _{2_{f}}\neq \boldsymbol {\emptyset }\). If \(\rho _{1_{f}}\neq \boldsymbol {\emptyset }\), suppose that \(Ran(\rho _{1_{f}})=\{u_{1}^{\prime }, ..., u_{h}^{\prime }\}\) and \(Ran(\varphi _{2_{f}})=\{u_{1}, ..., u_{l}\}\). \(\forall u_{i}^{\prime }\in Ran(\rho _{1_{f}})\), let \(V_{t_{3}}(u_{i}^{\prime })=\{X_{i1}, ..., X_{ik}\}(k\geq 1)\) under ρ1. \(\forall X_{ij}\in V_{t_{3}}(u_{i}^{\prime })\), \(occ_{t_{3}}(X_{ij})=n_{ij}^{\prime }\) and \({\sum }_{j=1}^{k}n_{ij}^{\prime }=n_{i}^{\prime }\). \(\forall u_{r}\in Ran(\varphi _{2_{f}})\), let \(V_{t_{2}}(u_{r})=\{Z_{r1}, ..., Z_{rs}\}(s\geq 1)\) under φ2. \(\forall Z_{rj}\in V_{t_{2}}(u_{r})\), \(occ_{t_{2}}(Z_{rj})=n_{rj}\) and \({\sum }_{j=1}^{s}n_{rj}=n_{r}\). By the previous proof, we have \(S_{f}(\rho _{1_{f}})={\sum }_{f\in \bigcup _{i=1}^{h}F(u_{i}^{\prime })}w_{f}(f)({\sum }_{i=1}^{h}n_{i}^{\prime }occ_{u_{i}^{\prime }}(f))\), \(S_{f}(\varphi _{2_{f}})={\sum }_{f\in \bigcup _{r=1}^{l}F(u_{r})}w_{f}(f)({\sum }_{r=1}^{l}n_{r}occ_{u_{r}}(f))\).

      In ρ1, \(\forall X\in \bigcup _{i=1}^{h}V_{t_{3}}(u_{i}^{\prime })\), X is substituted by a functional term. Because t1 and t2 can be unified, X must be substituted by a variable in ρ2. Hence, \(\exists Z\in \bigcup _{r=1}^{l}V_{t_{2}}(u_{r})\), such that Xρ1φ1 = Xρ2φ2 = Zφ2.

      That is, \(\forall f\in \bigcup _{i=1}^{h}F(u_{i}^{\prime })\), there must exist some \(u_{r}\in Ran(\varphi _{2_{f}})\) such that f ∈ F(ur). Hence,

      \(\bigcup _{i=1}^{h}F(u_{i}^{\prime })\subseteq \bigcup _{r=1}^{l}F(u_{r})\) and \({\sum }_{i=1}^{h}n_{i}^{\prime }occ_{u_{i}^{\prime }}(f)\leq {\sum }_{r=1}^{l}n_{r}occ_{u_{r}}(f)\).

      As a consequence,

      \(S_{f}(\rho _{1_{f}})\leq S_{f}(\varphi _{2_{f}})\).

      In the same way, if \(\rho _{2_{f}}\neq \boldsymbol {\emptyset }\), we have

      \(S_{f}(\rho _{2_{f}})\leq S_{f}(\varphi _{1_{f}})\).

      In conclusion,

      \(S_{f}(\rho _{1_{f}})+S_{f}(\rho _{2_{f}})\leq S_{f}(\varphi _{1_{f}})+S_{f}(\varphi _{2_{f}})\).

    Next, we prove \(S_{v}(\rho _{1_{v}})+S_{v}(\rho _{2_{v}})\leq S_{v}(\varphi _{1_{v}})+S_{v}(\varphi _{2_{v}})\) in the case where \(S_{f}(\rho _{1_{f}})+S_{f}(\rho _{2_{f}})=S_{f}(\varphi _{1_{f}})+S_{f}(\varphi _{2_{f}})\). In this case, we have

    \(S_{f}(\rho _{1_{f}})=S_{f}(\varphi _{2_{f}})\) and \(S_{f}(\rho _{2_{f}})=S_{f}(\varphi _{1_{f}})\).

    (a) \(S_{f}(\rho _{1_{f}})=S_{f}(\varphi _{2_{f}})=0\) and \(S_{f}(\rho _{2_{f}})=S_{f}(\varphi _{1_{f}})=0\). We have

      \(S_{v}(\rho _{1_{v}})\leq S_{v}(\varphi _{2_{v}})\), \(S_{v}(\rho _{2_{v}})\leq S_{v}(\varphi _{1_{v}})\).

      Hence,

      \(S_{v}(\rho _{1_{v}})+S_{v}(\rho _{2_{v}})\leq S_{v}(\varphi _{1_{v}})+S_{v}(\varphi _{2_{v}})\).

    (b) \(S_{f}(\rho _{1_{f}})=S_{f}(\varphi _{2_{f}}) \neq 0\) or \(S_{f}(\rho _{2_{f}})=S_{f}(\varphi _{1_{f}})\neq 0\). If \(S_{f}(\rho _{1_{f}})=S_{f}(\varphi _{2_{f}})\neq 0\), suppose that \(Ran(\rho _{1_{f}})=\{u_{1}^{\prime }, ..., u_{h}^{\prime }\}\). Let \(V_{t_{3}}(u_{i}^{\prime })=\{X_{i1}, ..., X_{ik}\}(k\geq 1)\) under ρ1. We can assert that \(\{X_{i1}\mapsto Y_{i1}, ..., X_{ik}\mapsto Y_{ik}\}\subseteq \rho _{2_{v}}\), where \(occ_{t_{3}}^{+}(X_{ij})=occ_{t_{2}}^{+}(Y_{ij})(1\leq j \leq k)\). For any singleton element \(\rho _{2_{v_{j}}}\in \rho _{2_{v}}\setminus \bigcup _{i}\{X_{i1}\mapsto Y_{i1}, ..., X_{ik}\mapsto Y_{ik}\}\), because t1 and t2 can be unified, there must exist a set \(\varphi _{1_{v_{j}}}\) such that \(S_{v}(\rho _{2_{v_{j}}})\leq S_{v}(\varphi _{1_{v_{j}}})\). Hence,

      \(S_{v}(\rho _{2_{v}})\leq S_{v}(\varphi _{1_{v}})\).

      In the same way, if \(S_{f}(\rho _{2_{f}})=S_{f}(\varphi _{1_{f}})\neq 0\), we have

      \(S_{v}(\rho _{1_{v}})\leq S_{v}(\varphi _{2_{v}})\).

      In conclusion,

      \(S_{v}(\rho _{1_{v}})+S_{v}(\rho _{2_{v}})\leq S_{v}(\varphi _{1_{v}})+S_{v}(\varphi _{2_{v}})\).

    Hence,

    $$S_{T}(\rho_{1})+S_{T}(\rho_{2})\leq_{2} S_{T}(\varphi_{1})+S_{T}(\varphi_{2}).$$
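
To make the counting in case 4(b) concrete, the following Python sketch evaluates the functional part Sf as the weighted number of new function-symbol occurrences introduced by a substitution, following the sum given in that case. It is a sketch under stated assumptions rather than the authors' implementation: the term encoding (variables as strings, applications as tuples), the function names, the treatment of constants as function symbols of arity 0, and the uniform weight wf = 2 (read from Note 1) are all choices made here for illustration.

```python
from collections import Counter

W_F = 2  # assumed uniform weight for every function symbol (Note 1)


def is_var(term):
    """Variables are encoded as strings; applications as tuples."""
    return isinstance(term, str)


def var_occurrences(term):
    """Count how often each variable occurs in a term."""
    if is_var(term):
        return Counter([term])
    total = Counter()
    for arg in term[1:]:
        total += var_occurrences(arg)
    return total


def symbol_occurrences(term):
    """Count how often each function symbol (constants included) occurs."""
    if is_var(term):
        return Counter()
    total = Counter([term[0]])
    for arg in term[1:]:
        total += symbol_occurrences(arg)
    return total


def s_f(term, theta):
    """Weighted count of function-symbol occurrences added to term by theta."""
    occ = var_occurrences(term)
    total = 0
    for var, image in theta.items():
        if is_var(image):
            continue  # variable-to-variable bindings belong to theta_v
        total += W_F * occ[var] * sum(symbol_occurrences(image).values())
    return total
```

For t1 = f(X, X), 𝜃1 = {X ↦ g(Y)}, and 𝜃2 = {Y ↦ a}, the composed substitution is 𝜃 = {X ↦ g(a)}. The sketch gives 4 for each step and 8 for the composition, matching the additivity claimed in case 4(b).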

Proof of Theorem 2

Given the function dT, dA clearly satisfies properties 1 and 2 of a pseudo-metric. We therefore only prove that dA satisfies the triangle inequality.

  1. A1 and A2 are compatible.

    (a) If A3 is compatible with A1 and A2, the triangle inequality is obviously satisfied.

    (b) If A3 is incompatible with A1 and A2, we have

      \(d_{A}(A_{1},A_{3})=_{2}(+\infty ,+\infty )\), \(d_{A}(A_{2},A_{3})=_{2}(+\infty ,+\infty )\),

      and \(d_{A}(A_{1},A_{3})+d_{A}(A_{2},A_{3})=_{2}(+\infty ,+\infty )\).

      Hence,

      \(d_{A}(A_{1},A_{2})<_{2}(+\infty ,+\infty )=_{2}d_{A}(A_{1},A_{3})+d_{A}(A_{2},A_{3})\).

  2. A1 and A2 are incompatible.

    (a) If A3 is incompatible with A1 and A2, we have

      \(d_{A}(A_{1},A_{2})=_{2}d_{A}(A_{1},A_{3})+d_{A}(A_{2},A_{3})\).

    (b) If A3 is compatible with exactly one of A1 and A2, assume without loss of generality that A3 is compatible with A1. Then

      \(d_{A}(A_{1},A_{3})<_{2}(+\infty ,+\infty )\), \(d_{A}(A_{2},A_{3})=_{2}(+\infty ,+\infty )\),

      and

      \(d_{A}(A_{1},A_{3})+d_{A}(A_{2},A_{3})=_{2}(+\infty ,+\infty )\).

      Hence,

      \(d_{A}(A_{1},A_{2})=_{2}d_{A}(A_{1},A_{3})+d_{A}(A_{2},A_{3})\).

In conclusion,

\(d_{A}(A_{1},A_{2})\leq_{2}d_{A}(A_{1},A_{3})+d_{A}(A_{2},A_{3})\). □
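
The case analysis above can be exercised with a small Python sketch. Everything below is an assumption made for illustration: atoms are tuples (predicate, arg1, ..., argn); two atoms are taken to be compatible when they share predicate symbol and arity; ≤2 is read as the lexicographic order on pairs, which Python's tuple comparison implements; and the distance between compatible atoms is a stand-in (the number of differing argument positions), not the paper's dT-based definition.

```python
import math
from itertools import product

INF2 = (math.inf, math.inf)  # distance assigned to incompatible atoms


def add2(p, q):
    """Componentwise addition of two distance pairs."""
    return (p[0] + q[0], p[1] + q[1])


def compatible(a, b):
    """Assumed notion of compatibility: same predicate symbol and arity."""
    return a[0] == b[0] and len(a) == len(b)


def d_atom(a, b):
    """Stand-in atom distance: mismatching argument positions, else INF2."""
    if not compatible(a, b):
        return INF2
    mismatches = sum(x != y for x, y in zip(a[1:], b[1:]))
    return (mismatches, 0)


def triangle_holds(atoms):
    """Check d(a1, a2) <= d(a1, a3) + d(a2, a3) for all triples of atoms."""
    return all(
        d_atom(a1, a2) <= add2(d_atom(a1, a3), d_atom(a2, a3))
        for a1, a2, a3 in product(atoms, repeat=3)
    )


sample_atoms = [('p', 'a', 'b'), ('p', 'a', 'c'), ('p', 'd', 'c'),
                ('q', 'a'), ('q', 'b')]
```

Calling `triangle_holds(sample_atoms)` runs through all four cases of the proof, since the sample mixes compatible and incompatible atoms.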

Proof of Theorem 3

Given the atom metric dA, it is obvious that \(d_{F_{m}}\) satisfies properties 1 and 2. Next, we prove that \(d_{F_{m}}\) satisfies the triangle inequality.

For a formula F with the corresponding atom set D, DrF = {A | ∃B ∈ D, dA(A,B) ≤ r} is the set of atoms that are at most r away from some atom in F. Observe that

\(d_{F_{m}}(F_{1},F_{2})=min\{r|D_{1}\subseteq D_{r}F_{2} \wedge D_{2}\subseteq D_{r}F_{1}\}\)

Given a formula F3 with the corresponding atom set D3, if

\(D_{3}\subseteq D_{r}F_{1}\wedge D_{1}\subseteq D_{r}F_{3} \wedge D_{3}\subseteq D_{s}F_{2}\wedge D_{2}\subseteq D_{s}F_{3}\),

then

∀A ∈ D1, ∃C ∈ D3, (dA(A,C) ≤ r ∧ ∃B ∈ D2, dA(C,B) ≤ s).

In this case, by the triangle inequality for dA (Theorem 2), \(D_{1}\subseteq D_{r+s}F_{2}\). By the symmetric argument, \(D_{2}\subseteq D_{r+s}F_{1}\).

Fig. 4 Parallelogram structure

Hence, if \(d_{F_{m}}(F_{1},F_{3})\leq r\) and \(d_{F_{m}}(F_{3},F_{2})\leq s\), then \(d_{F_{m}}(F_{1},F_{2})\leq r+s\), and consequently \(d_{F_{m}}(F_{1},F_{2}) \leq d_{F_{m}}(F_{1}, F_{3}) + d_{F_{m}}(F_{3}, F_{2})\). □
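
For finite, non-empty atom sets, the characterization min{r | D1 ⊆ DrF2 ∧ D2 ⊆ DrF1} used above coincides with the Hausdorff distance induced by the underlying atom distance. The following Python sketch computes it for an arbitrary distance function. The function names are hypothetical, and the test data uses integers with the absolute difference as a stand-in for dA, purely to illustrate the construction.

```python
def directed_distance(dist, xs, ys):
    """Largest distance from an element of xs to its nearest element of ys."""
    return max(min(dist(x, y) for y in ys) for x in xs)


def hausdorff(dist, xs, ys):
    """Smallest r such that xs lies within r of ys and ys within r of xs."""
    return max(directed_distance(dist, xs, ys),
               directed_distance(dist, ys, xs))


def abs_diff(x, y):
    """Stand-in element distance for the example (not the paper's dA)."""
    return abs(x - y)


# Illustrative "atom sets" encoded as lists of integers.
d1, d2, d3 = [0, 1], [3], [1, 5]
```

Because `max` and `min` only require a total order, the same code works unchanged when `dist` returns lexicographically ordered pairs.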

Cite this article

Liu, Q., Xu, Y. Axiom selection over large theory based on new first-order formula metrics. Appl Intell 52, 1793–1807 (2022). https://doi.org/10.1007/s10489-021-02469-1

  • DOI: https://doi.org/10.1007/s10489-021-02469-1
