Learning Predictive Clustering Rules

  • Conference paper
Knowledge Discovery in Inductive Databases (KDID 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3933)

Abstract

The two most commonly addressed data mining tasks are predictive modelling and clustering. Here we address the task of predictive clustering, which contains elements of both and generalizes them to some extent. Predictive clustering has been mainly evaluated in the context of trees. In this paper, we extend predictive clustering toward rules. Each cluster is described by a rule, and different clusters are allowed to overlap, since the sets of examples covered by different rules do not need to be disjoint. We propose a system for learning these predictive clustering rules, which is based on a heuristic sequential covering algorithm. The heuristic takes into account both the precision of the rules (compactness w.r.t. the target space) and the compactness w.r.t. the input space, and the two can be traded off by means of a parameter. We evaluate our system in the context of several multi-objective classification problems.
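To make the idea of a sequential covering search with a tunable heuristic more concrete, the following is a minimal sketch, not the system described in the paper. Everything in it is an assumption made for illustration: the names `dispersion`, `score`, `learn_rule`, `learn_rules` and the parameter `tau`, the restriction to nominal attributes, the dispersion measure (one minus the relative frequency of the majority value, averaged over attributes), the greedy single-condition refinement, and the toy data set.

```python
from collections import Counter


def dispersion(rows, attrs):
    """Assumed measure: mean over attrs of (1 - majority-value frequency)."""
    if not rows or not attrs:
        return 0.0
    total = 0.0
    for a in attrs:
        counts = Counter(r[a] for r in rows)
        total += 1.0 - counts.most_common(1)[0][1] / len(rows)
    return total / len(attrs)


def score(rows, inputs, targets, tau):
    """Lower is better. tau (hypothetical name) weights target-space
    dispersion against input-space dispersion."""
    return tau * dispersion(rows, targets) + (1 - tau) * dispersion(rows, inputs)


def covers(conds, row):
    """A rule is a list of (attribute, value) tests; an empty list covers all."""
    return all(row[a] == v for a, v in conds)


def learn_rule(rows, inputs, targets, tau, min_cover=2):
    """Greedily add the single condition that most improves the score."""
    conds, covered = [], rows
    best = score(covered, inputs, targets, tau)
    while True:
        best_cond, best_cov = None, None
        used = {a for a, _ in conds}
        for a in inputs:
            if a in used:
                continue
            for v in sorted({r[a] for r in covered}):
                cand = [r for r in covered if r[a] == v]
                if len(cand) < min_cover:
                    continue
                s = score(cand, inputs, targets, tau)
                if s < best:  # strict improvement only
                    best, best_cond, best_cov = s, (a, v), cand
        if best_cond is None:
            return conds, covered
        conds.append(best_cond)
        covered = best_cov


def learn_rules(rows, inputs, targets, tau=0.5, max_rules=10):
    """Sequential covering: learn a rule, remove the examples it covers, repeat."""
    remaining, rules = list(rows), []
    while remaining and len(rules) < max_rules:
        conds, covered = learn_rule(remaining, inputs, targets, tau)
        # Each rule predicts the majority value of every target attribute.
        prediction = {
            t: Counter(r[t] for r in covered).most_common(1)[0][0] for t in targets
        }
        rules.append((conds, prediction))
        if not conds:  # no useful condition found: keep as default rule, stop
            break
        remaining = [r for r in remaining if not covers(conds, r)]
    return rules


# Toy data (invented for illustration): two inputs, two targets.
DATA = [
    {"shape": "round", "color": "red", "t1": "a", "t2": "x"},
    {"shape": "round", "color": "red", "t1": "a", "t2": "x"},
    {"shape": "round", "color": "blue", "t1": "a", "t2": "x"},
    {"shape": "square", "color": "red", "t1": "b", "t2": "y"},
    {"shape": "square", "color": "blue", "t1": "b", "t2": "y"},
    {"shape": "square", "color": "blue", "t1": "b", "t2": "y"},
]
INPUTS, TARGETS = ["shape", "color"], ["t1", "t2"]

rules_target_only = learn_rules(DATA, INPUTS, TARGETS, tau=1.0)
rules_balanced = learn_rules(DATA, INPUTS, TARGETS, tau=0.5)
```

On this toy data, setting `tau = 1.0` scores rules by target-space dispersion only, so the first rule stops at `shape = round`, which is already pure in both targets. With `tau = 0.5` the same search also adds `color = red`, because the extra condition makes the covered examples more compact in the input space. Rules learned later are induced only on the examples still uncovered, but when applied to the full data they may cover examples of earlier rules too, which is one way overlapping clusters can arise in a covering scheme like this one.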

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ženko, B., Džeroski, S., Struyf, J. (2006). Learning Predictive Clustering Rules. In: Bonchi, F., Boulicaut, JF. (eds) Knowledge Discovery in Inductive Databases. KDID 2005. Lecture Notes in Computer Science, vol 3933. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11733492_14

  • DOI: https://doi.org/10.1007/11733492_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33292-3

  • Online ISBN: 978-3-540-33293-0

  • eBook Packages: Computer Science, Computer Science (R0)
