Abstract
Traditional association rule mining techniques employ the support and confidence framework. However, specifying the minimum support of the mined rules in advance often yields either too many or too few rules, which degrades the performance of the overall system. Here we propose replacing Apriori’s user-defined minimum support threshold with the more meaningful MinAbsSup function. MinAbsSup calculates a custom minimum support for each itemset based on the probability of a chance collision of its items, as derived from the inverse of Fisher’s exact test. We also introduce the notion of coincidental itemsets: in any transaction dataset, two independent items may appear together purely by chance. Rules generated from such itemsets do not denote a meaningful association and are not useful.
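The idea of deriving a per-itemset minimum support from the inverse of Fisher’s exact test can be illustrated with a minimal sketch. It is one plausible reading of the abstract, not the chapter’s actual definition of MinAbsSup: the function name `min_abs_sup`, its parameters, and the default cutoff of 0.001 are assumptions made for illustration. For two items with known individual counts, the sketch finds the smallest co-occurrence count whose upper-tail probability under independence (the hypergeometric distribution behind Fisher’s exact test) does not exceed the cutoff.

```python
from fractions import Fraction
from math import comb


def min_abs_sup(n, count_a, count_b, p_cutoff=0.001):
    """Smallest co-occurrence count c with P(X >= c) <= p_cutoff.

    X is the number of transactions containing both items when the items
    are independent: n transactions, count_a contain A, count_b contain B.
    Returns None if even the largest possible count is not rare enough.
    Hypothetical helper; the cutoff value is an assumption.
    """
    lowest = max(0, count_a + count_b - n)   # smallest feasible overlap
    highest = min(count_a, count_b)          # largest feasible overlap
    total = comb(n, count_b)                 # ways to place item B

    tail = Fraction(0)                       # running P(X >= c), exact
    threshold = None
    # Walk down from the largest overlap, accumulating the upper tail.
    for c in range(highest, lowest - 1, -1):
        ways = comb(count_a, c) * comb(n - count_a, count_b - c)
        tail += Fraction(ways, total)
        if tail > p_cutoff:
            break                            # c is too likely by chance
        threshold = c
    return threshold


if __name__ == "__main__":
    # 10 transactions, each item appears in 5 of them.
    print(min_abs_sup(10, 5, 5, p_cutoff=0.01))
```

Under this reading, a pair whose observed co-occurrence count falls below the returned value would be treated as a coincidental itemset and pruned, and no global minimum support needs to be supplied by the user. Exact rational arithmetic is used here only to keep the small example free of rounding concerns.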
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Koh, Y.S. (2008). Mining Non-coincidental Rules without a User Defined Support Threshold. In: Washio, T., Suzuki, E., Ting, K.M., Inokuchi, A. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2008. Lecture Notes in Computer Science, vol 5012. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68125-0_92
DOI: https://doi.org/10.1007/978-3-540-68125-0_92
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68124-3
Online ISBN: 978-3-540-68125-0
eBook Packages: Computer Science, Computer Science (R0)