Abstract
A rule set is a type of classifier that, given attributes \(X\), predicts a target \(Y\). Its main advantages over other types of classifiers are its simplicity and interpretability. A practical challenge is that the end user of a rule set does not always know in advance which target will need to be predicted. One way to deal with this is to learn a multi-directional rule set, which can predict any attribute from all others. An individual rule in such a multi-directional rule set can have multiple targets in its head and can thus be used to predict any one of them. Compared to the naive approach of learning one rule set for each possible target and merging them, a multi-directional rule set containing multi-target rules is potentially smaller and more interpretable. Training a multi-directional rule set involves two key steps: generating candidate rules and selecting rules. However, the best way to tackle these steps remains an open question. In this paper, we investigate the effect of using Random Forests as candidate rule generators and propose two new approaches for selecting rules with multi-target heads: MIDS, a generalization of the recent single-target IDS approach, and RR, a new, simple algorithm that focuses only on predictive performance. Our experiments indicate that (1) using multi-target rules leads to smaller rule sets with similar predictive performance, (2) using Random-Forest-derived rules instead of association rules yields rule sets of similar quality, and (3) RR outperforms MIDS, underlining the usefulness of simple selection objectives.
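To make the notion of a multi-target rule concrete, the following sketch shows one possible representation: a rule whose body is a conjunction of attribute tests and whose head assigns a value to several target attributes, so the same rule can predict any attribute appearing in its head. The class and attribute names are illustrative, not the paper's implementation.

```python
from dataclasses import dataclass

@dataclass
class MultiTargetRule:
    """A rule with a multi-target head: if the body matches, each head attribute is predicted."""
    body: dict  # attribute -> required value, e.g. {"outlook": "sunny"}
    head: dict  # attribute -> predicted value, e.g. {"play": "no", "wind": "weak"}

    def covers(self, instance: dict) -> bool:
        # The rule fires when every body literal matches the instance.
        return all(instance.get(a) == v for a, v in self.body.items())

    def predict(self, instance: dict, target: str):
        # A multi-target rule can be used to predict any one of its head attributes.
        if target in self.head and self.covers(instance):
            return self.head[target]
        return None  # rule does not cover the instance, or does not predict this target

rule = MultiTargetRule(body={"outlook": "sunny"},
                       head={"play": "no", "wind": "weak"})
print(rule.predict({"outlook": "sunny"}, "play"))  # -> no
print(rule.predict({"outlook": "sunny"}, "wind"))  # -> weak
```

In a merged single-target rule set, predicting both `play` and `wind` would require two separate rules with the same body; a multi-target rule collapses them into one, which is why multi-directional rule sets can be smaller.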
Notes
1. Note that while IDS specifies that it uses association rules, no modifications are necessary to support rules derived from decision trees. Any rule type for which coverage and overlap can be calculated is supported.
2. A better denominator is \(\sum_{r \in \mathcal{R}_{cand}} \mathit{length}(r)\) instead of \(L_{max} \cdot |\mathcal{R}_{cand}|\).
3. A stricter upper bound is \(\frac{N}{2} \cdot |\mathcal{R}_{cand,X_j}| \cdot (|\mathcal{R}_{cand,X_j}| - 1)\).
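The condition in Note 1 (that coverage and overlap be computable for any rule type) can be sketched as follows. This is an illustrative implementation over rule bodies represented as attribute-value dictionaries, not the paper's code; the function names are assumptions.

```python
def coverage(body, instances):
    """Indices of the instances satisfied by a rule body (attribute -> value tests)."""
    return {i for i, inst in enumerate(instances)
            if all(inst.get(a) == v for a, v in body.items())}

def overlap(body1, body2, instances):
    """Instances covered by both rules; IDS-style objectives penalize large overlaps."""
    return coverage(body1, instances) & coverage(body2, instances)

data = [{"x": 1, "y": 0}, {"x": 1, "y": 1}, {"x": 0, "y": 1}]
print(sorted(coverage({"x": 1}, data)))          # -> [0, 1]
print(sorted(overlap({"x": 1}, {"y": 1}, data)))  # -> [1]
```

Because these functions only inspect which instances a body matches, they apply equally to bodies mined as association rules or extracted from the branches of Random Forest trees.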
Acknowledgments
This research received funding from the KU Leuven Research Fund (C14/17/070, “SIRV”) and the Flemish Government under the “Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen” programme.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Schouterden, J., Davis, J., Blockeel, H. (2020). Multi-directional Rule Set Learning. In: Appice, A., Tsoumakas, G., Manolopoulos, Y., Matwin, S. (eds) Discovery Science. DS 2020. Lecture Notes in Computer Science(), vol 12323. Springer, Cham. https://doi.org/10.1007/978-3-030-61527-7_34
Print ISBN: 978-3-030-61526-0
Online ISBN: 978-3-030-61527-7