Levelwise Search and Pruning Strategies for First-Order Hypothesis Spaces

Journal of Intelligent Information Systems

Abstract

The discovery of interesting patterns in relational databases is an important data mining task. This paper develops a search algorithm for first-order hypothesis spaces that adapts an important pruning technique from association rule mining (termed subset pruning here) to a first-order setting. The basic search algorithm is extended by so-called requires and excludes constraints, which allow prior knowledge about the data, such as mutual exclusion or generalization relationships among attributes, to be declared and then exploited to further structure and restrict the search space. Furthermore, the paper illustrates how taxonomies and numerical attributes can be processed by the search algorithm.
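As a rough illustration of the idea behind subset pruning, the following is a minimal sketch in a simplified propositional setting, not the paper's first-order algorithm. It assumes that a hypothesis is a plain set of literals, that interestingness is a minimum support count, and that an excludes constraint is a pair of literals that may never occur together. The function name `levelwise_search` and its parameters are hypothetical.

```python
from itertools import combinations


def levelwise_search(items, transactions, min_support, excludes=()):
    """Levelwise (Apriori-style) search with subset pruning.

    Assumption: hypotheses are frozensets of literals. A first-order
    version would need a subsumption test instead of set inclusion.
    Returns (frequent hypotheses with their support, number of evaluations).
    """
    excludes = [frozenset(pair) for pair in excludes]
    evaluations = 0
    frequent = {}
    # Level 1: one candidate per single literal.
    candidates = [frozenset([item]) for item in sorted(items)]
    while candidates:
        level = set()
        for cand in candidates:
            evaluations += 1  # one pass over the data per candidate
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                frequent[cand] = support
                level.add(cand)
        # Build the next level from pairs of frequent hypotheses.
        next_candidates = set()
        for a in level:
            for b in level:
                union = a | b
                if len(union) != len(a) + 1:
                    continue
                # Excludes constraint: skip declared mutual exclusions.
                if any(pair <= union for pair in excludes):
                    continue
                # Subset pruning: every subset one literal smaller
                # must itself have been found frequent.
                if all(frozenset(sub) in level
                       for sub in combinations(union, len(a))):
                    next_candidates.add(union)
        candidates = sorted(next_candidates, key=sorted)
    return frequent, evaluations


if __name__ == "__main__":
    data = [frozenset("abc"), frozenset("ab"), frozenset("ac"),
            frozenset("bc"), frozenset("abcd")]
    print(levelwise_search("abcd", data, min_support=3))
    print(levelwise_search("abcd", data, min_support=3,
                           excludes=[("a", "b")]))
```

In this toy data set the infrequent literal `d` is never extended, so no candidate containing it is evaluated. Declaring `a` and `b` as mutually exclusive removes further candidates before they are counted against the data.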

Several task settings using different interestingness criteria and search modes, together with the corresponding pruning criteria, are described. Three settings serve as test beds for evaluating the proposed approach. The experimental evaluation shows that the impact of subset pruning is significant: in many cases it reduces the number of hypothesis evaluations by about 50%. Exploiting generalization relationships is shown to be less effective in our experimental set-up.

Cite this article

Weber, I. Levelwise Search and Pruning Strategies for First-Order Hypothesis Spaces. Journal of Intelligent Information Systems 14, 217–239 (2000). https://doi.org/10.1023/A:1008740003826

  • DOI: https://doi.org/10.1023/A:1008740003826
