ABSTRACT
This paper describes a new data mining algorithm for finding the most interesting dependence rules. Dependence rules are derived from itemsets whose support differs significantly from its expected value, which is what makes them interesting. Because such itemsets are distributed non-monotonically in the lattice of all itemsets, the support monotonicity property cannot be used to search for them. Instead, we estimate upper and lower bounds on the support and look for itemsets with a large interval of possible support values, called the support quota. Because the support quota is known to decrease monotonically, the search space can be restricted effectively. Strongly dependent itemsets are then selected by computing their expected support with the iterative proportional fitting algorithm and comparing it with the actual itemset support.
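The two quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The function names (`subsets`, `support_bounds`, `expected_support`), the toy support counts in `SUPP`, and the number of fitting sweeps are all hypothetical. The bounds are assumed to be the standard inclusion-exclusion bounds computed from the supports of all proper subsets, and the expected support is assumed to be fitted to those same subset supports; the abstract does not say which bounds or constraints the algorithm actually uses.

```python
from itertools import combinations, product


def subsets(items):
    """Yield every subset of `items` as a frozenset, smallest first."""
    pool = sorted(items)
    for size in range(len(pool) + 1):
        for combo in combinations(pool, size):
            yield frozenset(combo)


def support_bounds(itemset, supp):
    """Bound supp(itemset) using the supports of its proper subsets.

    For each proper subset X of I, sum sign * supp(J) over all J with
    X <= J < I, where sign is +1 if |I minus J| is odd and -1 if even.
    The sum is an upper bound when |I minus X| is odd, else a lower bound.
    """
    target = frozenset(itemset)
    lower, upper = 0, supp[frozenset()]
    for x in subsets(target):
        if x == target:
            continue
        delta = 0
        for extra in subsets(target - x):
            j = x | extra
            if j != target:
                sign = -1 if len(target - j) % 2 == 0 else 1
                delta += sign * supp[j]
        if len(target - x) % 2 == 1:
            upper = min(upper, delta)
        else:
            lower = max(lower, delta)
    return lower, upper


def expected_support(itemset, supp, sweeps=200):
    """Estimate supp(itemset) by iterative proportional fitting (IPF).

    Starts from a uniform table over all present/absent combinations and
    rescales it until it matches the support of every proper subset.
    """
    items = sorted(itemset)
    total = supp[frozenset()]
    cells = {bits: total / 2 ** len(items)
             for bits in product((0, 1), repeat=len(items))}
    constraints = [s for s in subsets(items) if 0 < len(s) < len(items)]
    for _ in range(sweeps):
        for s in constraints:
            idx = [items.index(i) for i in s]
            inside = [c for c in cells if all(c[k] for k in idx)]
            outside = [c for c in cells if not all(c[k] for k in idx)]
            for group, goal in ((inside, supp[s]), (outside, total - supp[s])):
                mass = sum(cells[c] for c in group)
                if mass > 0:
                    for c in group:
                        cells[c] *= goal / mass
    return cells[(1,) * len(items)]


# Hypothetical support counts over 10 transactions (empty set = all of them).
SUPP = {frozenset(): 10,
        frozenset("a"): 6, frozenset("b"): 7, frozenset("c"): 8,
        frozenset("ab"): 4, frozenset("ac"): 5, frozenset("bc"): 6}

if __name__ == "__main__":
    for name in ("ab", "abc"):
        low, high = support_bounds(name, SUPP)
        print(name, "bounds", (low, high), "quota", high - low)
    # ab bounds (3, 6) quota 3
    # abc bounds (3, 4) quota 1
    print("expected support of abc:", expected_support("abc", SUPP))
```

In this toy data the quota shrinks from 3 for {a, b} to 1 for {a, b, c}, which is the decrease the search exploits. An itemset would be flagged as strongly dependent when its actual support lies far from the fitted value; how far is a threshold the abstract does not specify.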
- Mining dependence rules by finding largest itemset support quota