Axiom selection over large theory based on new first-order formula metrics

Liu, Qinghua; Xu, Yang

doi:10.1007/s10489-021-02469-1


Published: 28 May 2021

Volume 52, pages 1793–1807, (2022)

Applied Intelligence


Abstract

Axiom selection is the task of selecting, from a large-scale axiom set, the axioms most likely to be useful for proving a given conjecture. Existing axiom selection methods either take only shallow symbols into account or depend strongly on previous successful proofs of homologous problems. To address these problems, we introduce a new metric that evaluates the dissimilarity between formulae and use it as an evaluator in the selection task. First, we propose a substitution-based metric to compute the dissimilarity between terms. It is a pseudo-metric and can capture the in-depth syntactic differences triggered by both functional and variable subterms. We then extend it to atoms and prove that the atom metric is also a pseudo-metric. Treating formulae as atom sets, we define three kinds of dissimilarity metrics between formulae. Finally, we design and implement conjecture-oriented axiom selection methods based on the newly proposed formula metrics. The experimental evaluation is conducted on the MPTP2078 benchmark and demonstrates that dissimilarity-based axiom selection improves the E prover’s performance. In the best case, it increases the ratio of successful proofs from 30.90% to 42.25%.



Notes

The values of 1 and 2 for w_v and w_f are adopted from their use in E. The use of ln(x + 1) for g(x) is motivated by its common adoption as a continuous increasing function.
https://github.com/JUrban/MPTP2078
The command is ./eprover --satauto-schedule --free-numbers -s -R --delete-bad-limit=2000000000 --definitional-cnf=24 --print-statistics --print-version --proof-object --cpu-limit=60 problem_file
The command is ./vampire --mode axiom_selection --output_axiom_names on problem_file
The command is ./eprover --satauto-schedule --free-numbers -s -R --delete-bad-limit=2000000000 --definitional-cnf=24 --print-statistics --print-version --proof-object --cpu-limit=60 --sine problem_file
The command is ./eprover --satauto-schedule --free-numbers -s -R --delete-bad-limit=2000000000 --definitional-cnf=24 --print-statistics --print-version --proof-object --cpu-limit=10 problem_file


Funding

This research was funded by the National Natural Science Foundation of China (grant numbers 61603307, 61673320, and 61473239) and by the Humanities and Social Sciences Project of the Ministry of Education of China (grant number 9YJCZH048).

Author information

Authors and Affiliations

School of Information Science and Technology, Southwest Jiaotong University, Chengdu, Sichuan Province, China
Qinghua Liu
School of Mathematics, Southwest Jiaotong University, Chengdu, Sichuan Province, China
Yang Xu


Corresponding author

Correspondence to Qinghua Liu.

Ethics declarations

Competing interests

The authors declare no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

Proof of Lemma 1

We need to prove that d′ satisfies all the properties of a pseudo-metric whenever the function S′ satisfies conditions 1–6.

Let t₁ and t₂ be two arbitrary terms. By condition 1,

$$d'(t_{1},t_{2})\geq_{2}(0,0).$$

If t₂ = t₁, then t₁ is its own lgg, and the substitution mapping a term to itself is either an empty substitution or a renaming substitution. By conditions 2 and 3,

$$d'(t_{1},t_{1})=_{2}(0,0).$$

By the definition of d′, symmetry is also satisfied. That is,

$$d'(t_{1},t_{2})=_{2}d'(t_{2},t_{1}).$$

Finally, we prove that d′ satisfies the triangle inequality.

As shown in Fig. 2, let t₁, t₂, and t₃ be three arbitrary terms with lgg(t₁,t₂) = t₄, lgg(t₁,t₃) = t₅, lgg(t₂,t₃) = t₆, and lgg(t₅,t₆) = t₇. Since t₅ and t₆ both generalize t₃, t₃ must be a unified term of t₅ and t₆. For convenience, we write 𝜃_ij for the substitution that maps term t_i to term t_j.

By the definition of d′,

$$d'(t_{1},t_{2})=_{2}S'(\theta_{41})+S'(\theta_{42}),$$

$$d'(t_{1},t_{3})=_{2}S'(\theta_{51})+S'(\theta_{53}),$$

$$d'(t_{2},t_{3})=_{2}S'(\theta_{62})+S'(\theta_{63}).$$

By condition 5,

$$S'(\theta_{41})\leq_{2} S'(\theta_{71}), \quad S'(\theta_{42})\leq_{2} S'(\theta_{72}).$$

By condition 4,

$$S'(\theta_{71})\leq_{2} S'(\theta_{75})+S'(\theta_{51}),\quad S'(\theta_{72})\leq_{2} S'(\theta_{76})+S'(\theta_{62}).$$

By condition 6,

$$S'(\theta_{75})+S'(\theta_{76})\leq_{2} S'(\theta_{53})+S'(\theta_{63}).$$

Hence,

$$ \begin{aligned} d'(t_{1},t_{2})&=_{2}S'(\theta_{41})+S'(\theta_{42})\leq_{2} S'(\theta_{71})+S'(\theta_{72}) \\ &\leq_{2} S'(\theta_{75})+S'(\theta_{51})+S'(\theta_{76})+S'(\theta_{62}) \\ &\leq_{2}S'(\theta_{51})+S'(\theta_{53})+S'(\theta_{62})+S'(\theta_{63}) \\ &=_{2}d'(t_{1},t_{3})+d'(t_{2},t_{3}). \end{aligned} $$

□
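The construction used in this proof, d′(t₁,t₂) = S′(𝜃₄₁) + S′(𝜃₄₂) with t₄ = lgg(t₁,t₂), can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' implementation: the term encoding (tuples for function applications, plain strings for variables), the helper names `lgg`, `s_f` and `d_f`, and the generated variable names `_G0`, `_G1`, ... are all hypothetical. It computes only the functional component of the substitution size, following the expression for S_f given in the proof of Theorem 1, with the uniform weight w_f = 2 mentioned in the Notes. The variable component is left out because its definition is not reproduced in this excerpt.

```python
from itertools import count

W_F = 2  # function-symbol weight w_f (value reported in the Notes)


def is_var(term):
    # Assumed encoding: variables are strings, applications are tuples
    # ("f", arg1, ..., argn); a constant a is the 1-tuple ("a",).
    return isinstance(term, str)


def lgg(t1, t2):
    """Plotkin-style least general generalization.

    Returns (g, theta1, theta2) such that applying theta1 to g gives t1
    and applying theta2 to g gives t2.
    """
    pairs = {}       # (subterm of t1, subterm of t2) -> generalizing variable
    fresh = count()

    def go(a, b):
        if a == b:
            return a
        if (not is_var(a) and not is_var(b)
                and a[0] == b[0] and len(a) == len(b)):
            return (a[0],) + tuple(go(x, y) for x, y in zip(a[1:], b[1:]))
        if (a, b) not in pairs:          # same pair -> same variable
            pairs[(a, b)] = f"_G{next(fresh)}"
        return pairs[(a, b)]

    g = go(t1, t2)
    theta1 = {v: a for (a, _), v in pairs.items()}
    theta2 = {v: b for (_, b), v in pairs.items()}
    return g, theta1, theta2


def occurrences(term, var):
    """Number of occurrences of variable var in term."""
    if is_var(term):
        return 1 if term == var else 0
    return sum(occurrences(arg, var) for arg in term[1:])


def symbol_count(term):
    """Number of function-symbol occurrences in term."""
    if is_var(term):
        return 0
    return 1 + sum(symbol_count(arg) for arg in term[1:])


def s_f(source, theta):
    """Functional part of the substitution size: each binding X -> u with
    u a functional term contributes w_f * occ_source(X) * (#symbols in u)."""
    return sum(W_F * occurrences(source, var) * symbol_count(image)
               for var, image in theta.items() if not is_var(image))


def d_f(t1, t2):
    """Functional component of d'(t1, t2) = S'(lgg -> t1) + S'(lgg -> t2)."""
    g, theta1, theta2 = lgg(t1, t2)
    return s_f(g, theta1) + s_f(g, theta2)


a, b, c, d = ("a",), ("b",), ("c",), ("d",)
print(lgg(("f", a, ("g", b)), ("f", c, ("g", d)))[0])
print(d_f(("f", a, ("g", b)), ("f", c, ("g", d))))
```

In this sketch, a variable of the lgg that occurs several times multiplies the cost of its binding, which is the role played by the factors n_i in the proof of Theorem 1.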

Proof of Theorem 1

By Lemma 1, we only need to show that the function S_T satisfies conditions 1–6.

1. For any substitution 𝜃, by (3) and (4), we have
$$S_{f}(\theta_{f})\geq 0 \quad\text{and}\quad S_{v}(\theta_{v})\geq 0.$$

Hence, by (5),
$$S_{T}(\theta)\geq_{2}(0,0).$$
2. For an empty substitution ε, both ε_f and ε_v are also empty substitutions. Hence, S_f(ε_f) and S_v(ε_v) are empty sums, and we have
$$S_{T}(\varepsilon)=_{2}(0,0).$$
3. For a renaming substitution η with t₁η = t₂, we have η_f = ε and η_v = η. Suppose that η_v = η = {X₁↦Y₁,...,X_n↦Y_n}, where Y₁, ..., Y_n are distinct variables in t₂. Then, for every X_i↦Y_i ∈ η_v, $occ_{t_{1}}^{+}(X_{i})=occ_{t_{2}}^{+}(Y_{i})$ and therefore $g(w_{v}(occ_{t_{2}}^{+}(Y_{i})-occ_{t_{1}}^{+}(X_{i})))=g(0)=0$. Hence,
$$ S_{T}(\eta)=_{2}(0,0).$$
4. As shown in Fig. 3, let t₁, t₂ and t₃ be three arbitrary terms that satisfy
$$t_{1}\theta_{1}=t_{2},\quad t_{2}\theta_{2}=t_{3} \quad\text{and}\quad t_{1}\theta=t_{3}.$$
Fig. 3: Triangle structure

To prove the inequality S_T(𝜃) ≤₂ S_T(𝜃₁) + S_T(𝜃₂), we first prove $S_{f}(\theta _{f})\leq S_{f}(\theta _{1_{f}})+S_{f}(\theta _{2_{f}})$.
(a)
  𝜃_f = ∅. Then $\theta _{1_{f}}=\boldsymbol {\emptyset }$ and $\theta _{2_{f}}=\boldsymbol {\emptyset }$, and we have
  
  $S_{f}(\theta _{f})=S_{f}(\theta _{1_{f}})+S_{f}(\theta _{2_{f}})$.
(b)
  𝜃_f ≠ ∅. Suppose that Ran(𝜃_f) = {u₁,...,u_m} (m ≥ 1). For each u_i ∈ Ran(𝜃_f), let $V_{t_{1}}(u_{i})=\{X_{i1}, ..., X_{ik}\}(k\geq 1)$ denote the set of variables of t₁ that 𝜃 replaces by u_i. For each $X_{ij}\in V_{t_{1}}(u_{i})$, let $occ_{t_{1}}(X_{ij})=n_{ij}(\geq 1)$ and ${\sum }_{j=1}^{k}n_{ij}=n_{i}$. Hence, S_f(𝜃_f) can also be written as:
  
  $S_{f}(\theta _{f})={\sum }_{f\in \bigcup _{i=1}^{m}F(u_{i})}w_{f}(f)({\sum }_{i=1}^{m}n_{i}occ_{u_{i}}(f))$.
  
  For any $f\in \bigcup _{i=1}^{m}F(u_{i})$, ${\sum }_{i=1}^{m}n_{i}occ_{u_{i}}(f)$ is the number of new occurrences of f introduced by 𝜃_f. The numbers of new occurrences of f introduced by $\theta _{1_{f}}$ and by $\theta _{2_{f}}$ sum to ${\sum }_{i=1}^{m}n_{i}occ_{u_{i}}(f)$. Hence,
  
  $S_{f}(\theta _{f})=S_{f}(\theta _{1_{f}})+S_{f}(\theta _{2_{f}})$.
Next, we prove $S_{v}(\theta _{v})\leq S_{v}(\theta _{1_{v}})+S_{v}(\theta _{2_{v}})$.
(a)
  𝜃_v is a renaming substitution or an empty substitution. Then $\theta _{1_{v}}$ and $\theta _{2_{v}}$ are each either an empty substitution or a renaming substitution (they should not both be empty at the same time). By items 2 and 3 above, we know
  
  $S_{v}(\theta _{v})=S_{v}(\theta _{1_{v}})=S_{v}(\theta _{2_{v}})=0$.
  
  Hence,
  
  $S_{v}(\theta _{v})=S_{v}(\theta _{1_{v}})+S_{v}(\theta _{2_{v}})$.
(b)
  Otherwise, suppose that 𝜃_v = {X₁↦Z₁,...,X_n↦Z_n}. ∀X_i↦Z_i ∈ 𝜃_v, let $\theta _{v_{i}}=\{X_{i}\mapsto Z_{i}\}$, then $\bigcup _{i=1}^{n}\theta _{v_{i}}=\theta _{v}$ and $\theta _{v_{i}}\cap \theta _{v_{j}}=\boldsymbol {\emptyset }(i\neq j)$. Obviously,
  
  $S_{v}(\theta _{v})={\sum }_{i=1}^{n}S_{v}(\theta _{v_{i}})$.
  
  For any $\theta _{v_{i}}$, we have
  
  $S_{v}(\theta _{v_{i}})=g(w_{v}(occ_{t_{3}}^{+}(Z_{i})-occ_{t_{1}}^{+}(X_{i})))$.
  
  There must exist substitutions $\theta _{1_{v_{i}}}=\{X_{i}\mapsto Y_{i}\}\subseteq \theta _{1_{v}}$ (Y_i may equal X_i) and $\theta _{2_{v_{i}}}=\{Y_{i}\mapsto Z_{i}\}\subseteq \theta _{2_{v}}$ (Y_i may equal Z_i) such that
  
  $X_{i}\theta _{1_{v_{i}}}\theta _{2_{v_{i}}}=Z_{i}$.
  
  If Y_i = X_i or Y_i = Z_i, then $\theta _{1_{v_{i}}}$ or $\theta _{2_{v_{i}}}$ is an empty substitution. Obviously,
  
  $occ_{t_{2}}^{+}(Y_{i})\geq occ_{t_{1}}^{+}(X_{i})$ and $occ_{t_{3}}^{+}(Z_{i})\geq occ_{t_{2}}^{+}(Y_{i})$.
  
  By the properties of function g, we have
  
  $g(w_{v}(occ_{t_{3}}^{+}(Z_{i})-occ_{t_{1}}^{+}(X_{i})))=g(w_{v}(occ_{t_{3}}^{+}(Z_{i})-occ_{t_{2}}^{+}(Y_{i})+occ_{t_{2}}^{+}(Y_{i})-occ_{t_{1}}^{+}(X_{i})))\leq g(w_{v}(occ_{t_{3}}^{+}(Z_{i})-occ_{t_{2}}^{+}(Y_{i})))+g(w_{v}(occ_{t_{2}}^{+}(Y_{i})-occ_{t_{1}}^{+}(X_{i})))$.
  
  Hence,
  
  $S_{v}(\theta _{v_{i}})\leq S_{v}(\theta _{1_{v_{i}}})+S_{v}(\theta _{2_{v_{i}}})$ and $S_{v}(\theta _{v})\leq S_{v}(\theta _{1_{v}})+S_{v}(\theta _{2_{v}})$.
As a result,

S_T(𝜃) ≤₂S_T(𝜃₁) + S_T(𝜃₂).
5. Under the same assumptions as in item 4, it is clear that $S_{f}(\theta _{2_{f}})\leq S_{f}(\theta _{f})$.
(a)
  $S_{f}(\theta _{2_{f}})<S_{f}(\theta _{f})$. Then clearly S_T(𝜃₂) <₂ S_T(𝜃).
(b)
  $S_{f}(\theta _{2_{f}})=S_{f}(\theta _{f})$. Suppose that 𝜃_v = {X₁↦Z₁,...,X_n↦Z_n}. For every X_i↦Z_i ∈ 𝜃_v, there must exist substitutions $\{X_{i}\mapsto Y_{i}^{\prime }\}\subseteq \theta _{1_{v}}$ and $\{Y_{i}^{\prime }\mapsto Z_{i}\}\subseteq \theta _{2_{v}}$. Let $\theta _{2_{v}}^{\prime }=\bigcup _{i}\{Y_{i}^{\prime }\mapsto Z_{i} \mid occ_{t_{3}}^{+}(Z_{i})\geq occ_{t_{2}}^{+}(Y_{i}^{\prime })\}$ and $\theta _{2_{v}}^{\prime \prime }=\bigcup _{j}\{Y_{j}^{\prime \prime }\mapsto Z_{j}^{\prime \prime } \mid Y_{j}^{\prime \prime } \in Ran(\theta _{1_{f}})\wedge occ_{t_{2}}^{+}(Y_{j}^{\prime \prime })=occ_{t_{3}}^{+}(Z_{j}^{\prime \prime }) \}$. We can assert that $\theta _{2_{v}}=\theta _{2_{v}}^{\prime }\cup \theta _{2_{v}}^{\prime \prime }$. Hence,
  
  $S_{v}(\theta _{2_{v}})\leq S_{v}(\theta _{v})$.
As a result,

S_T(𝜃₂) ≤₂S_T(𝜃).
6. As shown in Fig. 4, let t₁ and t₂ be two terms that can be unified, let lgg(t₁,t₂) = t₃, and let t₄ be a unified term of t₁ and t₂ (not necessarily the most general one). To prove S_T(ρ₁) + S_T(ρ₂) ≤₂ S_T(φ₁) + S_T(φ₂), we first prove $S_{f}(\rho _{1_{f}})+S_{f}(\rho _{2_{f}})\leq S_{f}(\varphi _{1_{f}})+S_{f}(\varphi _{2_{f}})$.
(a)
  $\rho _{1_{f}}=\boldsymbol {\emptyset }$ and $\rho _{2_{f}}=\boldsymbol {\emptyset }$. Obviously,
  
  $S_{f}(\rho _{1_{f}})+S_{f}(\rho _{2_{f}})\leq S_{f}(\varphi _{1_{f}})+S_{f}(\varphi _{2_{f}})$.
(b)
  $\rho _{1_{f}}\neq \boldsymbol {\emptyset }$ or $\rho _{2_{f}}\neq \boldsymbol {\emptyset }$. If $\rho _{1_{f}}\neq \boldsymbol {\emptyset }$, suppose that $Ran(\rho _{1_{f}})=\{u_{1}^{\prime }, ..., u_{h}^{\prime }\}$ and $Ran(\varphi _{2_{f}})=\{u_{1}, ..., u_{l}\}$. $\forall u_{i}^{\prime }\in Ran(\rho _{1_{f}})$, let $V_{t_{3}}(u_{i}^{\prime })=\{X_{i1}, ..., X_{ik}\}(k\geq 1)$ under ρ₁. $\forall X_{ij}\in V_{t_{3}}(u_{i}^{\prime })$, $occ_{t_{3}}(X_{ij})=n_{ij}^{\prime }$ and ${\sum }_{j=1}^{k}n_{ij}^{\prime }=n_{i}^{\prime }$. $\forall u_{r}\in Ran(\varphi _{2_{f}})$, let $V_{t_{2}}(u_{r})=\{Z_{r1}, ..., Z_{rs}\}(s\geq 1)$ under φ₂. $\forall Z_{rj}\in V_{t_{2}}(u_{r})$, $occ_{t_{2}}(Z_{rj})=n_{rj}$ and ${\sum }_{j=1}^{s}n_{rj}=n_{r}$. By the previous proof, we have $S_{f}(\rho _{1_{f}})={\sum }_{f\in \bigcup _{i=1}^{h}F(u_{i}^{\prime })}w_{f}(f)({\sum }_{i=1}^{h}n_{i}^{\prime }occ_{u_{i}^{\prime }}(f))$, $S_{f}(\varphi _{2_{f}})={\sum }_{f\in \bigcup _{r=1}^{l}F(u_{r})}w_{f}(f)({\sum }_{r=1}^{l}n_{r}occ_{u_{r}}(f))$.
  
  In ρ₁, every $X\in \bigcup _{i=1}^{h}V_{t_{3}}(u_{i}^{\prime })$ is substituted by a functional term. Because t₁ and t₂ can be unified, X must be substituted by a variable in ρ₂. Hence, $\exists Z\in \bigcup _{r=1}^{l}V_{t_{2}}(u_{r})$, such that Xρ₁φ₁ = Xρ₂φ₂ = Zφ₂.
  
  That is, $\forall f\in \bigcup _{i=1}^{h}F(u_{i}^{\prime })$, there must exist some $u_{r}\in Ran(\varphi _{2_{f}})$ such that f occurs in u_r. Hence,
  
  $\bigcup _{i=1}^{h}F(u_{i}^{\prime })\subseteq \bigcup _{r=1}^{l}F(u_{r})$ and ${\sum }_{i=1}^{h}n_{i}^{\prime }occ_{u_{i}^{\prime }}(f)\leq {\sum }_{r=1}^{l}n_{r}occ_{u_{r}}(f)$.
  
  As a consequence,
  
  $S_{f}(\rho _{1_{f}})\leq S_{f}(\varphi _{2_{f}})$.
  
  In the same way, if $\rho _{2_{f}}\neq \boldsymbol {\emptyset }$, we have
  
  $S_{f}(\rho _{2_{f}})\leq S_{f}(\varphi _{1_{f}})$.
  
  In conclusion,
  
  $S_{f}(\rho _{1_{f}})+S_{f}(\rho _{2_{f}})\leq S_{f}(\varphi _{1_{f}})+S_{f}(\varphi _{2_{f}})$.
Next, we prove $S_{v}(\rho _{1_{v}})+S_{v}(\rho _{2_{v}})\leq S_{v}(\varphi _{1_{v}})+S_{v}(\varphi _{2_{v}})$ for the case $S_{f}(\rho _{1_{f}})+S_{f}(\rho _{2_{f}})=S_{f}(\varphi _{1_{f}})+S_{f}(\varphi _{2_{f}})$. In this case, we have

$S_{f}(\rho _{1_{f}})=S_{f}(\varphi _{2_{f}})$ and $S_{f}(\rho _{2_{f}})=S_{f}(\varphi _{1_{f}})$.
(a)
  $S_{f}(\rho _{1_{f}})=S_{f}(\varphi _{2_{f}})=0$ and $S_{f}(\rho _{2_{f}})=S_{f}(\varphi _{1_{f}})=0$. We have
  
  $S_{v}(\rho _{1_{v}})\leq S_{v}(\varphi _{2_{v}})$, $S_{v}(\rho _{2_{v}})\leq S_{v}(\varphi _{1_{v}})$.
  
  Hence,
  
  $S_{v}(\rho _{1_{v}})+S_{v}(\rho _{2_{v}})\leq S_{v}(\varphi _{1_{v}})+S_{v}(\varphi _{2_{v}})$.
(b)
  $S_{f}(\rho _{1_{f}})=S_{f}(\varphi _{2_{f}}) \neq 0$ or $S_{f}(\rho _{2_{f}})=S_{f}(\varphi _{1_{f}})\neq 0$. If $S_{f}(\rho _{1_{f}})=S_{f}(\varphi _{2_{f}})\neq 0$, suppose that $Ran(\rho _{1_{f}})=\{u_{1}^{\prime }, ..., u_{h}^{\prime }\}$. Let $V_{t_{3}}(u_{i}^{\prime })=\{X_{i1}, ..., X_{ik}\}(k\geq 1)$ under ρ₁. We can assert that $\{X_{i1}\mapsto Y_{i1}, ..., X_{ik}\mapsto Y_{ik}\}\subseteq \rho _{2_{v}}$, where $occ_{t_{3}}^{+}(X_{ij})=occ_{t_{2}}^{+}(Y_{ij})(1\leq j \leq k)$. For any singleton element $\rho _{2_{v_{j}}}\in \rho _{2_{v}}\setminus \bigcup _{i}\{X_{i1}\mapsto Y_{i1}, ..., X_{ik}\mapsto Y_{ik}\}$, because t₁ and t₂ can be unified, there must exist a set $\varphi _{1_{v_{j}}}$ such that $S_{v}(\rho _{2_{v_{j}}})\leq S_{v}(\varphi _{1_{v_{j}}})$. Hence,
  
  $S_{v}(\rho _{2_{v}})\leq S_{v}(\varphi _{1_{v}})$.
  
  In the same way, if $S_{f}(\rho _{2_{f}})=S_{f}(\varphi _{1_{f}})\neq 0$, we have
  
  $S_{v}(\rho _{1_{v}})\leq S_{v}(\varphi _{2_{v}})$.
  
  In conclusion,
  
  $S_{v}(\rho _{1_{v}})+S_{v}(\rho _{2_{v}})\leq S_{v}(\varphi _{1_{v}})+S_{v}(\varphi _{2_{v}})$.
Hence,
$$S_{T}(\rho_{1})+S_{T}(\rho_{2})\leq_{2} S_{T}(\varphi_{1})+S_{T}(\varphi_{2}).$$

□
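Item 4(b) above appeals to "the properties of function g" without spelling them out. For the choice g(x) = ln(x + 1) reported in the Notes, the property needed there (subadditivity on non-negative arguments) can be checked directly. The following is a worked verification under the assumptions that w_v > 0 and that a, b ≥ 0 stand for the two non-negative occurrence differences $occ_{t_{3}}^{+}(Z_{i})-occ_{t_{2}}^{+}(Y_{i})$ and $occ_{t_{2}}^{+}(Y_{i})-occ_{t_{1}}^{+}(X_{i})$; the symbols a and b are introduced here only for illustration.

$$ \begin{aligned} g(w_{v}(a+b)) &= \ln(w_{v}a+w_{v}b+1) \\ &\leq \ln(w_{v}^{2}ab+w_{v}a+w_{v}b+1) \\ &= \ln\big((w_{v}a+1)(w_{v}b+1)\big) \\ &= \ln(w_{v}a+1)+\ln(w_{v}b+1) = g(w_{v}a)+g(w_{v}b). \end{aligned} $$

The inequality in the second line uses only $w_{v}^{2}ab\geq 0$ and the monotonicity of the logarithm.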

Proof of Theorem 2

Given the function d_T, d_A clearly satisfies properties 1 and 2 of a pseudo-metric. It remains to prove that d_A satisfies the triangle inequality.

1. A₁ and A₂ are compatible.
(a)
  If A₃ is compatible with A₁ and A₂, the triangle inequality clearly holds.
(b)
  If A₃ is incompatible with A₁ and A₂, we have
  
  $d_{A}(A_{1},A_{3})=_{2}(+\infty ,+\infty )$, $\quad d_{A}(A_{2},A_{3})=_{2}(+\infty ,+\infty )$,
  
  and $d_{A}(A_{1},A_{3})+d_{A}(A_{2},A_{3})=_{2}(+\infty ,+\infty )$.
  
  Hence,
  
  $d_{A}(A_{1},A_{2})<_{2}(+\infty ,+\infty )=_{2}d_{A}(A_{1},A_{3})+d_{A}(A_{2},A_{3})$.
2. A₁ and A₂ are incompatible.
(a)
  If A₃ is incompatible with A₁ and A₂, we have
  
  d_A(A₁,A₂) =₂d_A(A₁,A₃) + d_A(A₂,A₃).
(b)
  If A₃ is compatible with exactly one of A₁ and A₂, we assume without loss of generality that A₃ is compatible with A₁. Then
  
  $d_{A}(A_{1},A_{3})<_{2}(+\infty ,+\infty )$, $\quad d_{A}(A_{2},A_{3})=_{2}(+\infty ,+\infty )$,
  
  and
  
  $d_{A}(A_{1},A_{3})+d_{A}(A_{2},A_{3})=_{2}(+\infty ,+\infty )$.
  
  Hence,
  
  d_A(A₁,A₂) =₂d_A(A₁,A₃) + d_A(A₂,A₃).

In conclusion,

d_A(A₁,A₂) ≤₂d_A(A₁,A₃) + d_A(A₂,A₃). □

Proof of Theorem 3

Given the atom metric d_A, it is clear that $d_{F_{m}}$ satisfies properties 1 and 2. It remains to prove that $d_{F_{m}}$ satisfies the triangle inequality.

For a formula F with corresponding atom set D, let D_rF = {A | ∃B ∈ D, d_A(A,B) ≤ r} be the set of atoms at distance at most r from some atom in F. Observe that

$d_{F_{m}}(F_{1},F_{2})=\min\{r \mid D_{1}\subseteq D_{r}F_{2} \wedge D_{2}\subseteq D_{r}F_{1}\}$.

Given a formula F₃ with the corresponding atom set D₃, if

$D_{3}\subseteq D_{r}F_{1}\wedge D_{1}\subseteq D_{r}F_{3} \wedge D_{3}\subseteq D_{s}F_{2}\wedge D_{2}\subseteq D_{s}F_{3}$,

then

∀A ∈ D₁, ∃C ∈ D₃, (d_A(A,C) ≤ r ∧ ∃B ∈ D₂, d_A(C,B) ≤ s).

By the triangle inequality for d_A (Theorem 2), d_A(A,B) ≤ d_A(A,C) + d_A(C,B) ≤ r + s, so $D_{1}\subseteq D_{r+s}F_{2}$. By the symmetric argument, $D_{2}\subseteq D_{r+s}F_{1}$.

Hence, if $d_{F_{m}}(F_{1},F_{3})\leq r$ and $d_{F_{m}}(F_{3},F_{2})\leq s$, then $d_{F_{m}}(F_{1},F_{2})\leq r+s$, and consequently $d_{F_{m}}(F_{1},F_{2}) \leq d_{F_{m}}(F_{1}, F_{3}) + d_{F_{m}}(F_{3}, F_{2})$. □
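The characterization of $d_{F_{m}}$ used in this proof, the smallest r for which each atom set lies within distance r of the other, is a Hausdorff-style distance over finite atom sets. The sketch below illustrates that reading under stated assumptions: `toy_atom_distance` is a hypothetical stand-in for d_A (it treats two atoms as compatible when they share predicate symbol and arity, and counts differing argument positions), and `formula_distance` is an illustrative name rather than anything taken from the paper. Pair-valued distances are compared lexicographically, which is how Python orders tuples.

```python
import math

INF = (math.inf, math.inf)  # distance between incompatible atoms


def toy_atom_distance(a, b):
    """Hypothetical stand-in for d_A on atoms encoded as (pred, arg1, ...).

    Assumption: atoms are compatible iff they share predicate symbol and
    arity; the first component counts differing argument positions.
    """
    if a[0] != b[0] or len(a) != len(b):
        return INF
    mismatches = sum(1 for x, y in zip(a[1:], b[1:]) if x != y)
    return (mismatches, 0)


def formula_distance(atoms1, atoms2, dist=toy_atom_distance):
    """Smallest r such that every atom of each (non-empty) set has an atom
    of the other set within distance r."""
    def directed(xs, ys):
        # Worst case, over xs, of the distance to the nearest atom in ys.
        return max(min(dist(x, y) for y in ys) for x in xs)

    return max(directed(atoms1, atoms2), directed(atoms2, atoms1))


f1 = {("p", "a", "b"), ("q", "a")}
f2 = {("p", "a", "c"), ("q", "a")}
f3 = {("r", "a")}

print(formula_distance(f1, f2))
print(formula_distance(f1, f3))
```

With these toy inputs, the first pair of formulae differs in a single argument of one atom, while the second pair shares no compatible atoms, so its distance is the infinite pair.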


About this article

Cite this article

Liu, Q., Xu, Y. Axiom selection over large theory based on new first-order formula metrics. Appl Intell 52, 1793–1807 (2022). https://doi.org/10.1007/s10489-021-02469-1


Accepted: 21 April 2021
Published: 28 May 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s10489-021-02469-1
