Abstract
Many classifiers treat the entities to be classified as points in a high-dimensional vector space and then compute a separator S that divides entities in class \(+1\) from those in class \(-1\). However, such classifiers are usually very hard to explain in plain English to domain experts. We propose Metric Logic Programs (MLPs), a fragment of constraint logic programs, as a new paradigm for explaining S. We present multiple measures of the quality of an MLP, define the problem of finding an MLP-Explanation of S, and show that it, along with various related problems, is NP-hard. We present the MLP_Extract algorithm to extract MLP explanations for S. We show that our algorithms provide more succinct, simpler, and higher-fidelity explanations than association rules, which are less expressive, at the cost of additional run-time.
Notes
1. The proofs of all theorems can be found in the Appendix at the end of the paper.
2. Although a metric constraint can be expressed via interval constraints, doing so requires n interval constraints. This would yield a very long rule body, defeating our goal of succinct rules. We use metric constraints to gain succinctness.
3. MLP-E may not be in NP due to the overlap constraint. After guessing the set E of ML-rules in non-deterministic polynomial time, we need an oracle to check overlap, but this check is at least \(\# P\)-hard by Theorem 1. We leave this as an open problem.
4. uco is the maximum number of points in \(\overline{W}\), over all constraints.
5. Tables 3–9 in the online Appendix [1] show the results for each data set.
6. Detailed results are reported in Tables 10–15 in the online Appendix [1].
References
Online appendix (2016). https://sites.google.com/site/mlpextraction/
Agrawal, R., Imieliński, T., Swami, A.: Mining association rules between sets of items in large databases. SIGMOD Rec. 22(2), 207–216 (1993)
Barakat, N.H., Bradley, A.P.: Rule extraction from support vector machines: a review. Neurocomputing 74(1–3), 178–190 (2010)
Beldiceanu, N., Carlsson, M., Flener, P., Pearson, J.: On the reification of global constraints. Constraints 18(1), 1–6 (2013)
Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Wadsworth and Brooks, Monterey (1984)
Cohen, W.W.: Fast effective rule induction. In: ICML, pp. 115–123 (1995)
Craven, M.W., Shavlik, J.W.: Extracting tree-structured representations of trained networks. Adv. Neural Inf. Process. Syst. 8, 24–30 (1996)
Diederich, J. (ed.): Rule Extraction from Support Vector Machines. Studies in Computational Intelligence, vol. 80. Springer, Heidelberg (2008)
Dyer, M.E., Frieze, A.M.: On the complexity of computing the volume of a polyhedron. SIAM J. Comput. 17(5), 967–974 (1988)
Eiter, T., Gottlob, G.: The complexity of logic-based abduction. J. ACM (JACM) 42(1), 3–42 (1995)
Jaffar, J., Lassez, J.-L.: Constraint logic programming. In: POPL, pp. 111–119 (1987)
Kakas, A.C., Michael, A., Mourlas, C.: ACLP: abductive constraint logic programming. J. Log. Program. 44(1), 129–177 (2000)
Lloyd, J.W.: Foundations of Logic Programming. Springer, New York (1987)
Martens, D., Baesens, B., Van Gestel, T.: Decompositional rule extraction from support vector machines by active learning. IEEE Trans. Knowl. Data Eng. 21(2), 178–191 (2009)
Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers Inc., Los Altos (1993)
Reiter, R.: On closed world data bases. In: Ginsberg, M.L. (ed.) Readings in Nonmonotonic Reasoning, pp. 300–310. Kaufmann, Los Altos (1987)
Schmitz, G.P., Aldrich, C., Gouws, F.S.: ANN-DT: an algorithm for extraction of decision trees from artificial neural networks. IEEE Trans. Neural Netw. 10(6), 1392–1401 (1999)
Zhu, P., Hu, Q.: Rule extraction from support vector machines based on consistent region covering reduction. Knowl. Based Syst. 42, 1–8 (2013)
Acknowledgements
Parts of this work were supported by ONR grant N000141612739 and ARO grant W911NF1610342.
Appendix: Proofs
Proof of Theorem 1 (Sketch). Dyer and Frieze [9] proved that computing the volume of the polytope \(P=\{ \mathbf x \in \mathbb {R}^n \ | \ {\mathbf {a}}^T\mathbf x \le c, 0 \le \mathbf x [i] \le 1, i=1,\ldots ,n \}\), where \({\mathbf {a}}\) is a vector of positive integers, is \(\#P\)-hard. Consider a hyper-rectangle R with \( 0 \le \mathbf x [i] \le 1\) for all \(i \in D\), and a hyper-plane of the form \(\mathbf w ^T\phi (\mathbf x ) + b = 0\), where \(\mathbf w ={\mathbf {a}}\) and \(b=-c-1\). Since the volume of R is equal to 1, computing \({\mathbf {ov}}(R,y)=\frac{V_{\overline{y}}(R)}{V(R)}=V_{\overline{y}}(R)\), where \(V_{\overline{y}}(R)\) is the volume of the overlap region of R with class \(\overline{y}\), is the same as computing the volume of the polytope P.
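To make the quantity in this proof concrete, the following is a minimal sketch that estimates the volume of the polytope P by uniform sampling from the unit hyper-rectangle R. It is not part of the paper's method: the function name `estimate_polytope_volume` and its parameters are hypothetical, and sampling only approximates the volume, so it does not contradict the \(\#P\)-hardness of computing it exactly.

```python
import random

def estimate_polytope_volume(a, c, samples=100_000, seed=0):
    """Monte Carlo estimate of vol({x in [0,1]^n : a.x <= c}).

    Hypothetical helper for illustration only; `a` is the integer
    vector and `c` the bound from the definition of the polytope P.
    """
    rng = random.Random(seed)
    n = len(a)
    hits = 0
    for _ in range(samples):
        # Draw a point uniformly from the unit hyper-rectangle R.
        x = [rng.random() for _ in range(n)]
        if sum(ai * xi for ai, xi in zip(a, x)) <= c:
            hits += 1
    # V(R) = 1, so the fraction of hits estimates the volume directly.
    return hits / samples

# In two dimensions with a = (1, 1) and c = 1, P is a triangle of area 1/2,
# so the estimate should be close to 0.5.
print(estimate_polytope_volume([1, 1], 1.0))
```

Because R has volume 1, the hit fraction plays the role of the ratio \(V_{\overline{y}}(R)/V(R)\) used in the proof.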
Proof of Theorem 2 (Sketch). The result is proven via a reduction from the 3-colorability problem, which is known to be NP-hard, to the following decision version of the MLP-Explanation problem: verify whether there exist at most k MLRs that satisfy all four constraints of the MLP-Explanation problem. Given a graph \(G=(V,E)\), the 3-colorability problem consists of verifying whether there exists an assignment of one of three colors to each node such that no two neighboring nodes receive the same color.
The reduction is as follows. We set (i) \(k=3\), (ii) \(n=|V|\), i.e. we have one dimension for each vertex, (iii) \(T^y = \{ \mathbf u _i \in \{ 0,1\}^n \ | \ \mathbf u _i[i]=1 \wedge i\in D : \ (\forall j\in D \wedge j\ne i \Rightarrow \mathbf u _i[j]=0) \}\) to represent all vertices \(v_i \in V\), (iv) \(T^{\overline{y}} = \{ \mathbf u \in \{ 0,1\}^n \ | (v_i,v_j) \in E \wedge \mathbf u [i]=1 \wedge \mathbf u [j]=1 \wedge (\forall h \in D : h \ne i,j \Rightarrow \mathbf u [h]=0) \}\) to represent the set E of edges, and (v) \(m=0\). Because of the first MLP-Explanation problem constraint and \(k=3\), each node must be covered by at least one of the three rules, so each rule assigns a color to each node it covers.
For instance, suppose G has only two vertices \(v_1\) and \(v_2\) and one edge \((v_1,v_2)\). Then \(T^y = \{[1,0],[0,1]\}\) and \(T^{\overline{y}} = \{[1,1]\}\). The rule \(0 \le \mathbf x [1] \le 1 \wedge 0 \le \mathbf x [2] \le 1 \rightarrow y\) covers \(T^y \cup T^{\overline{y}}\).
Hence, no two nodes that are connected by an edge can be covered by the same rule; otherwise the rule would also cover a point in \(T^{\overline{y}}\). This implies that two nodes connected by an edge cannot receive the same color.
Note that the same node may be colored in two different ways, but to obtain a feasible coloring it suffices to choose one of the two colors, which does not violate the constraints. It follows that the decision version of the MLP-Explanation problem is at least as hard as 3-colorability, and hence the MLP-Explanation problem is NP-hard.
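The construction above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's code: vertices are 0-indexed, a rule is represented as a list of `(low, high)` intervals (one per dimension, matching \(m=0\)), and `rule_for_color_class` is a hypothetical encoding that leaves the dimensions of one color class free and pins all others to 0. The check covers only the two properties the proof sketch relies on: every point of \(T^y\) is covered by some rule, and no point of \(T^{\overline{y}}\) is covered.

```python
def vertex_points(n):
    """T^y: one unit vector per vertex."""
    return [tuple(1 if j == i else 0 for j in range(n)) for i in range(n)]

def edge_points(n, edges):
    """T^ybar: one point per edge, with ones at both endpoints."""
    return [tuple(1 if j in (u, v) else 0 for j in range(n)) for u, v in edges]

def covers(rule, point):
    """A rule is a list of (low, high) intervals, one per dimension."""
    return all(lo <= x <= hi for (lo, hi), x in zip(rule, point))

def rule_for_color_class(n, color_class):
    """Hypothetical encoding: class dimensions range over [0, 1], others are pinned to 0."""
    return [(0, 1) if i in color_class else (0, 0) for i in range(n)]

def is_explanation(n, edges, color_classes):
    """Check coverage of all vertex points and of no edge point."""
    rules = [rule_for_color_class(n, c) for c in color_classes]
    all_covered = all(any(covers(r, p) for r in rules) for p in vertex_points(n))
    none_bad = not any(covers(r, q) for r in rules for q in edge_points(n, edges))
    return all_covered and none_bad

# The two-vertex example from the proof (vertices renamed 0 and 1).
n, edges = 2, [(0, 1)]
full_rule = [(0, 1), (0, 1)]
print(all(covers(full_rule, p) for p in vertex_points(n) + edge_points(n, edges)))  # True
print(is_explanation(n, edges, [{0}, {1}, set()]))     # True: a proper coloring
print(is_explanation(n, edges, [{0, 1}, set(), set()]))  # False: covers the edge point
```

With this encoding, a rule avoids every edge point exactly when its color class contains no two adjacent vertices, which mirrors the argument in the proof.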
Proof of Theorem 3 (Sketch). We prove the theorem via a reduction from the maximum independent set problem (in its decision version), which is known to be NP-hard. Recall that, given a graph \(G=(V,E)\), this problem consists of deciding whether there is a set of vertices \(V' \subseteq V\) s.t. \(|V'| \ge k\) and there are no edges in E between any pair of vertices in \(V'\).
The reduction is as follows. We set (i) \(n=|V|\), i.e. we have one dimension for each vertex, (ii) \(W = \{ \mathbf u _i \in \{ 0,1\}^n \ | \ \mathbf u _i[i]=1 \wedge i\in D : \ (\forall j\in D \wedge j\ne i \Rightarrow \mathbf u _i[j]=0) \}\) to represent all vertices \(v_i \in V\), (iii) \(\overline{W} = \{ \mathbf u \in \{ 0,1\}^n \ | (v_i,v_j) \in E \wedge \mathbf u [i]=1 \wedge \mathbf u [j]=1 \wedge (\forall h \in D : h \ne i,j \Rightarrow \mathbf u [h]=0) \}\) to represent the set E of edges, and (iv) \(m=0\). Our problem is then to find a rule, if one exists, that covers at least k points in W and no point in \(\overline{W}\). In this reduction we do not consider the MLP-Explanation problem constraints 2 and 3.
Note that if a rule covers two vertices, then it automatically also covers the edge between them, if such an edge exists. For instance, suppose G has only two vertices \(v_1\) and \(v_2\) and one edge \((v_1,v_2)\). Then \(W = \{[1,0],[0,1]\}\) and \(\overline{W} = \{[1,1]\}\). The rule \(0 \le \mathbf x [1] \le 1 \wedge 0 \le \mathbf x [2] \le 1 \rightarrow y\) covers \(W \cup \overline{W}\). It follows that finding a rule that covers at least k points in W and does not cover any point in \(\overline{W}\) is at least as hard as the independent set problem.
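The correspondence between rules and independent sets can be illustrated with a small brute-force search. This is only a sketch under assumptions that are not in the paper: vertices are 0-indexed, the search is restricted to a hypothetical family of rules that leave a chosen subset of dimensions free in \([0,1]\) and pin the others to 0, and the function name `find_rule` is made up for the example. The search is exponential in the number of vertices, as one would expect from the hardness result.

```python
from itertools import combinations

def covers(rule, point):
    """A rule is a list of (low, high) intervals, one per dimension."""
    return all(lo <= x <= hi for (lo, hi), x in zip(rule, point))

def find_rule(n, edges, k):
    """Return a vertex subset whose rule covers >= k points of W and none of W-bar.

    Only rules of the form 'free dimensions for the subset, others pinned to 0'
    are searched (an assumption made for this illustration).
    """
    W = [tuple(1 if j == i else 0 for j in range(n)) for i in range(n)]
    W_bar = [tuple(1 if j in (u, v) else 0 for j in range(n)) for u, v in edges]
    # Try larger subsets first so the first hit covers as many points as possible.
    for size in range(n, k - 1, -1):
        for subset in combinations(range(n), size):
            rule = [(0, 1) if i in subset else (0, 0) for i in range(n)]
            if any(covers(rule, q) for q in W_bar):
                continue  # the subset contains both endpoints of some edge
            if sum(covers(rule, p) for p in W) >= k:
                return set(subset)
    return None

# Path graph 0 - 1 - 2: the end vertices form an independent set of size 2.
print(find_rule(3, [(0, 1), (1, 2)], 2))  # {0, 2}
print(find_rule(3, [(0, 1), (1, 2)], 3))  # None
```

In this restricted family, a rule avoids \(\overline{W}\) exactly when its free dimensions form an independent set, and the number of points of W it covers equals the size of that set.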
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Kumar, S., Serra, E., Spezzano, F., Subrahmanian, V.S. (2016). Metric Logic Program Explanations for Complex Separator Functions. In: Schockaert, S., Senellart, P. (eds) Scalable Uncertainty Management. SUM 2016. Lecture Notes in Computer Science, vol 9858. Springer, Cham. https://doi.org/10.1007/978-3-319-45856-4_14
DOI: https://doi.org/10.1007/978-3-319-45856-4_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45855-7
Online ISBN: 978-3-319-45856-4
eBook Packages: Computer Science, Computer Science (R0)