Abstract
To mine databases in which examples are tagged with class labels, the minimum correlation constraint has been studied as an alternative to the minimum frequency constraint. We reformulate previous approaches and show that a minimum correlation constraint can be transformed into a disjunction of minimum frequency constraints. We prove that this observation extends to the multi-class χ² correlation measure, and thus obtain an efficient new O(n) prune test. We illustrate how the relation between correlation measures and minimum support thresholds allows previously discovered pattern sets to be reused, thus avoiding unnecessary database evaluations. We conclude with experimental results that assess the effectiveness of algorithms based on our observations.
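The transformation of a correlation constraint into a disjunction of frequency constraints can be illustrated for the two-class case. The following is a minimal sketch, not the authors' implementation: the function names (`chi2`, `min_support_thresholds`, `may_be_correlated`) and the example numbers are hypothetical. It assumes two non-empty classes of sizes `n1` and `n2`, and it relies on the convexity of χ² in the per-class supports, which gives χ²(a1, a2) ≤ max(χ²(a1, 0), χ²(0, a2)). Because χ²(a, 0) grows with a, the condition χ² ≥ θ can only hold if a1 ≥ t1 or a2 ≥ t2 for thresholds t1, t2 that depend only on θ and the class sizes.

```python
def chi2(a1, a2, n1, n2):
    """Chi-square of the 2x2 table for a pattern that covers
    a1 of the n1 class-1 examples and a2 of the n2 class-2 examples."""
    n = n1 + n2
    s = a1 + a2  # total support of the pattern
    if s == 0 or s == n:
        return 0.0  # degenerate table: no correlation by convention
    return n * (a1 * n2 - a2 * n1) ** 2 / (n1 * n2 * s * (n - s))


def min_support_thresholds(theta, n1, n2):
    """Per-class support thresholds (t1, t2) derived from chi2 >= theta.

    Solving chi2(a, 0) >= theta for a gives a >= t1; t2 is symmetric.
    """
    n = n1 + n2
    t1 = theta * n1 * n / (n * n2 + theta * n1)
    t2 = theta * n2 * n / (n * n1 + theta * n2)
    return t1, t2


def may_be_correlated(a1, a2, theta, n1, n2):
    """Necessary condition for chi2 >= theta, written as a disjunction
    of two minimum frequency constraints (one per class)."""
    t1, t2 = min_support_thresholds(theta, n1, n2)
    return a1 >= t1 or a2 >= t2


if __name__ == "__main__":
    # Hypothetical data set: 30 examples of class 1, 20 of class 2.
    t1, t2 = min_support_thresholds(3.84, 30, 20)
    print(f"prune unless support_1 >= {t1:.3f} or support_2 >= {t2:.3f}")
```

Since supports can only shrink when a pattern is specialised, a pattern that fails both frequency tests can be pruned together with all of its specialisations. The test is a necessary condition only, so patterns that pass it still need their actual χ² value checked.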
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nijssen, S., Kok, J.N. (2006). Multi-class Correlated Pattern Mining. In: Bonchi, F., Boulicaut, JF. (eds) Knowledge Discovery in Inductive Databases. KDID 2005. Lecture Notes in Computer Science, vol 3933. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11733492_10
DOI: https://doi.org/10.1007/11733492_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33292-3
Online ISBN: 978-3-540-33293-0
eBook Packages: Computer Science, Computer Science (R0)