Metric Logic Program Explanations for Complex Separator Functions

  • Conference paper
Scalable Uncertainty Management (SUM 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9858)

Abstract

Many classifiers treat the entities to be classified as points in a high-dimensional vector space and then compute a separator S that separates entities in class \(+1\) from those in class \(-1\). However, such classifiers are usually very hard to explain in plain English to domain experts. We propose Metric Logic Programs (MLPs), a fragment of constraint logic programs, as a new paradigm for explaining S. We present multiple measures of the quality of an MLP, define the problem of finding an MLP-Explanation of S, and show that this problem and various related problems are NP-hard. We present the MLP_Extract algorithm to extract MLP explanations for S. We show that our algorithms provide more succinct, simpler, and higher-fidelity explanations than association rules, which are less expressive, but at the cost of additional run-time.

Notes

  1. The proofs of all theorems can be found in the Appendix at the end of the paper.

  2. Although a metric constraint can be expressed via interval constraints, doing so would require n interval constraints. This would yield a very long rule body, defeating our goal of succinct rules. We use metric constraints to gain succinctness.

  3. MLP-E may not be in NP due to the overlap constraint. After guessing the set E of ML-rules in non-deterministic polynomial time, we need an oracle to check overlap, but this is at least \(\# P\)-hard by Theorem 1. We leave this as an open problem.

  4. uco is the maximum number of points in \(\overline{W}\), over all constraints.

  5. Tables 3–9 in the online Appendix [1] show the results for each data set.

  6. Detailed results are reported in Tables 10–15 in the online Appendix [1].

References

  1. Online appendix (2016). https://sites.google.com/site/mlpextraction/

  2. Agrawal, R., Imieliński, T., Swami, A.: Mining association rules between sets of items in large databases. SIGMOD Rec. 22(2), 207–216 (1993)

  3. Barakat, N.H., Bradley, A.P.: Rule extraction from support vector machines: a review. Neurocomputing 74(1–3), 178–190 (2010)

  4. Beldiceanu, N., Carlsson, M., Flener, P., Pearson, J.: On the reification of global constraints. Constraints 18(1), 1–6 (2013)

  5. Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Wadsworth and Brooks, Monterey (1984)

  6. Cohen, W.W.: Fast effective rule induction. In: ICML, pp. 115–123 (1995)

  7. Craven, M.W., Shavlik, J.W.: Extracting tree-structured representations of trained networks. Adv. Neural Inf. Process. Syst. 8, 24–30 (1996)

  8. Diederich, J. (ed.): Rule Extraction from Support Vector Machines. Studies in Computational Intelligence, vol. 80. Springer, Heidelberg (2008)

  9. Dyer, M.E., Frieze, A.M.: On the complexity of computing the volume of a polyhedron. SIAM J. Comput. 17(5), 967–974 (1988)

  10. Eiter, T., Gottlob, G.: The complexity of logic-based abduction. J. ACM (JACM) 42(1), 3–42 (1995)

  11. Jaffar, J., Lassez, J.-L.: Constraint logic programming. In: POPL, pp. 111–119 (1987)

  12. Kakas, A.C., Michael, A., Mourlas, C.: ACLP: abductive constraint logic programming. J. Log. Program. 44(1), 129–177 (2000)

  13. Lloyd, J.W.: Foundations of Logic Programming. Springer, New York (1987)

  14. Martens, D., Baesens, B., Van Gestel, T.: Decompositional rule extraction from support vector machines by active learning. IEEE Trans. Knowl. Data Eng. 21(2), 178–191 (2009)

  15. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers Inc., Los Altos (1993)

  16. Reiter, R.: On closed world data bases. In: Ginsberg, M.L. (ed.) Readings in Nonmonotonic Reasoning, pp. 300–310. Kaufmann, Los Altos (1987)

  17. Schmitz, G.P., Aldrich, C., Gouws, F.S.: ANN-DT: an algorithm for extraction of decision trees from artificial neural networks. IEEE Trans. Neural Netw. 10(6), 1392–1401 (1999)

  18. Zhu, P., Qinghua, H.: Rule extraction from support vector machines based on consistent region covering reduction. Knowl. Based Syst. 42, 1–8 (2013)

Acknowledgements

Parts of this work were supported by ONR grant N000141612739 and ARO grant W911NF1610342.

Author information

Corresponding author

Correspondence to Francesca Spezzano.

Appendix: Proofs

Proof of Theorem 1 (Sketch). Dyer and Frieze [9] proved that computing the volume of the polytope \(P=\{ \mathbf x \in \mathbb {R}^n \ | \ {\mathbf {a}}^T\mathbf x \le c, 0 \le \mathbf x [i] \le 1, i=1,\ldots ,n \}\), where \({\mathbf {a}}\) is a vector of positive integers, is \(\#P\)-hard. Consider a hyper-rectangle R defined by \( 0 \le \mathbf x [i] \le 1\) for all \(i \in D\), and a hyper-plane of the form \(\mathbf w ^T\phi (\mathbf x ) + b = 0\), where \(\mathbf w ={\mathbf {a}}\) and \(b=-c-1\). Since the volume of R is equal to 1, we have \({\mathbf {ov}}(R,y)=\frac{V_{\overline{y}}(R)}{V(R)}=V_{\overline{y}}(R)\), where \(V_{\overline{y}}(R)\) denotes the volume of the region of R that overlaps class \(\overline{y}\). Hence computing \({\mathbf {ov}}(R,y)\) is the same as computing the volume of the polytope P.
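To make the quantity in this reduction concrete, the following is a minimal Python sketch (not part of the paper) that computes the volume of \(\{ \mathbf x \in [0,1]^n \ | \ {\mathbf {a}}^T\mathbf x \le c \}\). It assumes the classical inclusion-exclusion formula for a half-space cut of the unit cube, which sums over all \(2^n\) cube vertices and therefore takes exponential time, in line with the \(\#P\)-hardness result. A Monte Carlo estimate is included as a cross-check. The function names `cut_cube_volume` and `monte_carlo_volume` are illustrative only.

```python
import itertools
import math
import random


def cut_cube_volume(a, c):
    """Exact volume of {x in [0,1]^n : a.x <= c} for positive weights a.

    Inclusion-exclusion over the 2^n vertices of the unit cube, so the
    running time is exponential in the dimension n.
    """
    n = len(a)
    total = 0.0
    for v in itertools.product((0, 1), repeat=n):
        slack = c - sum(ai * vi for ai, vi in zip(a, v))
        if slack > 0:
            total += (-1) ** sum(v) * slack ** n
    return total / (math.factorial(n) * math.prod(a))


def monte_carlo_volume(a, c, samples=20000, seed=0):
    """Estimate the same volume by uniform sampling in the unit cube."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x = [rng.random() for _ in a]
        if sum(ai * xi for ai, xi in zip(a, x)) <= c:
            hits += 1
    return hits / samples
```

For \({\mathbf {a}}=(1,1)\) and \(c=1.5\), the cut removes a right triangle with legs of length 0.5 from the unit square, so the exact volume is 0.875; the sampling estimate should be close to this value.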

Proof of Theorem 2 (Sketch). The result is proven by a reduction from the 3-colorability problem, which is known to be NP-hard, to the following decision version of the MLP-Explanation problem: decide whether there exist at most k MLRs that satisfy all four constraints of the MLP-Explanation problem. Given a graph \(G=(V,E)\), the 3-colorability problem asks whether there exists an assignment of one of three colors to each node such that no two neighboring nodes receive the same color.

The reduction is the following. We set (i) \(k=3\), (ii) \(n=|V|\), i.e., we have one dimension for each vertex, (iii) \(T^y = \{ \mathbf u _i \in \{ 0,1\}^n \ | \ i\in D \wedge \mathbf u _i[i]=1 \wedge (\forall j\in D : j\ne i \Rightarrow \mathbf u _i[j]=0) \}\) to represent all vertices \(v_i \in V\), (iv) \(T^{\overline{y}} = \{ \mathbf u \in \{ 0,1\}^n \ | \ (v_i,v_j) \in E \wedge \mathbf u [i]=1 \wedge \mathbf u [j]=1 \wedge (\forall h \in D : h \ne i,j \Rightarrow \mathbf u [h]=0) \}\) to represent the set E of edges, and (v) \(m=0\). Because of the first constraint of the MLP-Explanation problem and \(k=3\), every node must be covered by at least one of the three rules, so each rule can be read as a color assigned to the nodes it covers.

For instance, suppose G has only two vertices \(v_1\) and \(v_2\) and one edge \((v_1,v_2)\). Then \(T^y = \{[1,0],[0,1]\}\) and \(T^{\overline{y}} = \{[1,1]\}\). The rule \(0 \le \mathbf x [1] \le 1 \wedge 0 \le \mathbf x [2] \le 1 \rightarrow y\) covers all of \(T^y \cup T^{\overline{y}}\), including the edge point \([1,1]\).

Hence no rule can cover two nodes that are connected by an edge, since otherwise the rule would also cover a point in \(T^{\overline{y}}\). This implies that two nodes connected by an edge cannot receive the same color.

Note that a node may be covered by more than one rule and thus receive more than one color; to obtain a feasible coloring, it suffices to pick any one of these colors, which does not violate the constraints. It follows that the decision version of the MLP-Explanation problem is at least as hard as 3-colorability, and therefore the MLP-Explanation problem is NP-hard.
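The construction in this reduction can be sketched in Python as follows. This is an illustration only, not the paper's algorithm: it assumes rules consist solely of per-dimension interval constraints (as in the example rule above), uses 0-indexed vertices, and all function names are hypothetical. Each color class is turned into a rule that allows \([0,1]\) on the dimensions of its vertices and forces 0 elsewhere.

```python
def vertex_points(n):
    """T^y: one unit vector per vertex."""
    return [tuple(int(j == i) for j in range(n)) for i in range(n)]


def edge_points(n, edges):
    """T^{y-bar}: one point per edge, with 1s at both endpoints."""
    return [tuple(int(j in (a, b)) for j in range(n)) for a, b in edges]


def rule_for_color_class(n, color_class):
    """Interval rule: x[i] in [0, 1] for vertices in the class, x[j] = 0 otherwise."""
    return [(0, 1) if i in color_class else (0, 0) for i in range(n)]


def covers(rule, point):
    """A point is covered when every coordinate lies in its interval."""
    return all(lo <= x <= hi for (lo, hi), x in zip(rule, point))


def is_valid_explanation(rules, positives, negatives):
    """All positive points covered by some rule, no negative point covered (m = 0)."""
    covers_all_pos = all(any(covers(r, p) for r in rules) for p in positives)
    covers_no_neg = not any(covers(r, q) for r in rules for q in negatives)
    return covers_all_pos and covers_no_neg


# A 4-cycle 0-1-2-3-0 with the proper coloring {0, 2}, {1, 3}.
n = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
positives = vertex_points(n)
negatives = edge_points(n, edges)

proper = [rule_for_color_class(n, {0, 2}), rule_for_color_class(n, {1, 3})]
improper = [rule_for_color_class(n, {0, 1, 2, 3})]

proper_ok = is_valid_explanation(proper, positives, negatives)
improper_ok = is_valid_explanation(improper, positives, negatives)
```

Under these assumptions, the rules built from the proper coloring cover every vertex point and no edge point, while the single rule that lumps all vertices together also covers the edge points and is rejected.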

Proof of Theorem 3 (Sketch). We prove the theorem via a reduction from the independent set problem, which is known to be NP-hard. Recall that, given a graph \(G=(V,E)\) and an integer k, the independent set problem asks for a set of vertices \(V' \subseteq V\) such that \(|V'| \ge k\) and there are no edges in E between any pair of vertices in \(V'\).

The reduction is the following. We set (i) \(n=|V|\), i.e., we have one dimension for each vertex, (ii) \(W = \{ \mathbf u _i \in \{ 0,1\}^n \ | \ i\in D \wedge \mathbf u _i[i]=1 \wedge (\forall j\in D : j\ne i \Rightarrow \mathbf u _i[j]=0) \}\) to represent all vertices \(v_i \in V\), (iii) \(\overline{W} = \{ \mathbf u \in \{ 0,1\}^n \ | \ (v_i,v_j) \in E \wedge \mathbf u [i]=1 \wedge \mathbf u [j]=1 \wedge (\forall h \in D : h \ne i,j \Rightarrow \mathbf u [h]=0) \}\) to represent the set E of edges, and (iv) \(m=0\). Our problem is then to find a rule, if one exists, that covers at least k points in W and no point in \(\overline{W}\). In this reduction we do not consider constraints 2 and 3 of the MLP-Explanation problem. Note that if a rule covers two vertices, then it automatically also covers the edge between them, if such an edge exists.

For instance, suppose G has only two vertices \(v_1\) and \(v_2\) and one edge \((v_1,v_2)\). Then \(W = \{[1,0],[0,1]\}\) and \(\overline{W} = \{[1,1]\}\). The rule \(0 \le \mathbf x [1] \le 1 \wedge 0 \le \mathbf x [2] \le 1 \rightarrow y\) covers all of \(W \cup \overline{W}\). It follows that finding a rule that covers at least k points in W and does not cover any point in \(\overline{W}\) is at least as hard as the independent set problem.
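The correspondence between single rules and independent sets can be checked by brute force on small graphs. The Python sketch below is an illustration under stated assumptions, not the paper's method: rules are taken to be axis-parallel interval constraints only, so the tightest rule covering a chosen set of W points is their bounding box; vertices are 0-indexed; and the names `bounding_box` and `best_single_rule` are hypothetical. The search is exponential in the number of vertices.

```python
from itertools import combinations


def bounding_box(points):
    """Tightest per-dimension intervals containing all given points."""
    return [(min(col), max(col)) for col in zip(*points)]


def covers(box, point):
    """A point is covered when every coordinate lies in its interval."""
    return all(lo <= x <= hi for (lo, hi), x in zip(box, point))


def best_single_rule(n, edges):
    """Largest set of W points whose bounding box avoids every W-bar point."""
    W = [tuple(int(j == i) for j in range(n)) for i in range(n)]
    W_bar = [tuple(int(j in e) for j in range(n)) for e in edges]
    # Try larger vertex sets first, so the first admissible box is a maximum one.
    for size in range(n, 0, -1):
        for chosen in combinations(range(n), size):
            box = bounding_box([W[i] for i in chosen])
            if not any(covers(box, q) for q in W_bar):
                return chosen, box
    return (), []


# Path graph 0 - 1 - 2: the largest independent set is {0, 2}.
chosen, box = best_single_rule(3, [(0, 1), (1, 2)])
```

On the path graph, the box around vertices 0 and 2 fixes the middle coordinate to 0, so neither edge point is covered; any box around two adjacent vertices necessarily contains their edge point.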

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Kumar, S., Serra, E., Spezzano, F., Subrahmanian, V.S. (2016). Metric Logic Program Explanations for Complex Separator Functions. In: Schockaert, S., Senellart, P. (eds) Scalable Uncertainty Management. SUM 2016. Lecture Notes in Computer Science, vol 9858. Springer, Cham. https://doi.org/10.1007/978-3-319-45856-4_14

  • DOI: https://doi.org/10.1007/978-3-319-45856-4_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-45855-7

  • Online ISBN: 978-3-319-45856-4

  • eBook Packages: Computer Science, Computer Science (R0)
