Abstract
Detecting interactions among data items is an essential part of knowledge discovery. The cascade model is a rule induction methodology that uses levelwise expansion of an itemset lattice. It detects positive and negative interactions in categorical data using a sum of squares criterion. An attribute-value pair is expressed as an item, and the BSS (between-groups sum of squares) value along a link in the itemset lattice indicates the strength of interaction among item pairs. A link with a strong interaction is represented as a rule. The items on the node constitute the left-hand side (LHS) of the rule, and the right-hand side (RHS) displays veiled items that interact strongly with the added item. Consequently, a rule can be obtained without generating an itemset that contains the RHS items. This property makes rule induction efficient: rule links can be detected dynamically while the lattice is being generated. Furthermore, the BSS value of the added attribute gives an upper bound on those of the other attributes along the link, which yields an effective pruning method for the itemset lattice. The method was implemented in the software DISCAS, in which the items that may appear in the LHS and RHS are easily controlled by input parameters. The algorithms are described, and an application is provided as an illustrative example.
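To make the sum of squares criterion concrete, the following is a minimal sketch in Python. It assumes the Gini-based sum of squares for categorical data, SS = (n/2)(1 - Σ p(a)²), and the standard decomposition TSS = WSS + BSS over groups of records. The abstract does not state the exact BSS formula used along a lattice link, so the function names (`sum_of_squares`, `group_bss`) and the toy data are illustrative assumptions, not the DISCAS implementation.

```python
from collections import Counter


def sum_of_squares(values):
    """Gini-based sum of squares for one categorical attribute (assumed form):
    SS = (n / 2) * (1 - sum_a p(a)^2)."""
    n = len(values)
    if n == 0:
        return 0.0
    counts = Counter(values)
    return n / 2 * (1 - sum((c / n) ** 2 for c in counts.values()))


def group_bss(group, whole):
    """Contribution of one group to the between-groups sum of squares:
    (n_g / 2) * sum_a (p_g(a) - p(a))^2, where p is the distribution
    over all records and p_g the distribution inside the group."""
    n_g, n = len(group), len(whole)
    if n_g == 0 or n == 0:
        return 0.0
    in_group, overall = Counter(group), Counter(whole)
    return n_g / 2 * sum(
        (in_group[a] / n_g - overall[a] / n) ** 2 for a in overall
    )


# Hypothetical toy data: values of a target attribute for the records
# that contain an added item and for the records that do not.
with_item = ["y", "y", "y", "n"]
without_item = ["n", "n", "n", "n"]
all_records = with_item + without_item

tss = sum_of_squares(all_records)                                   # 1.875
wss = sum_of_squares(with_item) + sum_of_squares(without_item)      # 0.75
bss = group_bss(with_item, all_records) + group_bss(without_item, all_records)

print(tss, wss, bss)  # the decomposition gives tss == wss + bss
```

A large `group_bss` value for the records selected by an added item means the target attribute's distribution shifts markedly when that item is added, which is the sense in which a BSS value can signal a strong interaction.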
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Okada, T. (2000). Efficient Detection of Local Interactions in the Cascade Model. In: Terano, T., Liu, H., Chen, A.L.P. (eds) Knowledge Discovery and Data Mining. Current Issues and New Applications. PAKDD 2000. Lecture Notes in Computer Science, vol 1805. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45571-X_24
DOI: https://doi.org/10.1007/3-540-45571-X_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67382-8
Online ISBN: 978-3-540-45571-4
eBook Packages: Springer Book Archive