Abstract
Finding all interesting patterns in a database is a data mining task that typically requires a complete search through the hypothesis space. Several ILP systems address this task, e.g., [Deh98], [Wro97], [FL01]. Safe pruning techniques, which reduce the size of the hypothesis space without the risk of missing interesting patterns, are therefore very important for this task. This paper is concerned with the effectiveness of pruning techniques in this setting. The pruning techniques addressed are (1) optimum estimates, (2) a pruning technique based on subset tests, derived from the Apriori search algorithm, (3) pruning based on taxonomies, and (4) considering only the most general patterns as interesting. Methods (1) to (3) are safe pruning techniques that find all interesting patterns; method (4) reduces the number of accepted patterns. The effect of these pruning methods is investigated experimentally within a range of specific task settings and on two databases.
Experimental results indicate that optimum estimates and Apriori-style pruning are effective and reliable pruning techniques that incur little additional cost. The effect of taxonomies on pruning is smaller and varies across task settings. In the experiments, the restriction to most general patterns considerably reduces both the search costs and the set of accepted patterns.
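The Apriori-style subset-test pruning in (2) is easiest to see in its propositional (itemset) analogue: a candidate pattern at level k can be discarded whenever one of its generalisations at level k-1 is already known to be uninteresting, since interestingness (frequency) is anti-monotone under specialisation. A minimal Python sketch of that subset test follows; the `apriori_prune` function and the example itemsets are illustrative assumptions (the paper itself operates on first-order patterns, not itemsets):

```python
from itertools import combinations

def apriori_prune(candidates, frequent_prev):
    """Keep only candidates whose every (k-1)-subset is frequent.

    candidates:    iterable of frozensets, each of size k
    frequent_prev: set of frozensets of size k-1 known to be frequent
    """
    kept = []
    for cand in candidates:
        # Every (k-1)-element generalisation must already be frequent,
        # otherwise the candidate cannot be frequent either.
        subsets = (frozenset(s) for s in combinations(cand, len(cand) - 1))
        if all(s in frequent_prev for s in subsets):
            kept.append(cand)
    return kept

# Example: suppose these 2-itemsets were found frequent at the previous level.
frequent_2 = {frozenset(p) for p in [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d")]}
candidates_3 = [frozenset(["a", "b", "c"]), frozenset(["a", "b", "d"])]

# {a,b,d} is pruned because its subset {a,d} is not frequent.
print(apriori_prune(candidates_3, frequent_2))
```

The point of the test is that it needs no database access at all: candidates are rejected purely by lookups against the previous level's results, which is why this kind of pruning adds so little overhead.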
References
R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, and I. Verkamo. Fast discovery of association rules. In U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, editors, Advances in Knowledge Discovery and Data Mining. MIT Press, Cambridge, MA, 1996.
Luc Dehaspe. Frequent pattern discovery in first-order logic. PhD thesis, K.U.Leuven, December 1998.
L. Fleury, C. Djeraba, J. Philippe, and H. Briand. Contribution of the implication intensity in rules evaluations for knowledge discovery in databases. In Y. Kodratoff, G. Nakhaeizadeh, and C. Taylor, editors, Workshop Notes of the ECML-95 Workshop Statistics, Machine Learning and Knowledge Discovery in Databases, 1995.
P. Flach and N. Lachiche. Confirmation-guided discovery of first-order rules with Tertius. Machine Learning, 42(1/2):61–95, 2001.
R. Gras and A. Larher. L’implication statistique, une nouvelle méthode d’analyse de données. Mathématique, Informatique et Sciences Humaines, (120), 1993.
S. Rapp. Automatic labeling of German prosody. In Proc. of Int. Conference on Spoken Language Processing (ICSLP’98), 1998.
I. Weber. Level-wise search and pruning strategies for first-order hypothesis spaces. Journal of Intelligent Information Systems, 14(2–3):217–239, 2000.
Stefan Wrobel. An algorithm for multi-relational discovery of subgroups. In J. Komorowski and J. Zytkow, editors, Proc. First European Symposium on Principles of Knowledge Discovery and Data Mining. Springer, 1997.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
Cite this paper
Weber, I. (2003). Experimental Investigation of Pruning Methods for Relational Pattern Discovery. In: Matwin, S., Sammut, C. (eds) Inductive Logic Programming. ILP 2002. Lecture Notes in Computer Science, vol 2583. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36468-4_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00567-4
Online ISBN: 978-3-540-36468-9