Abstract
How to efficiently discard potentially uninteresting rules in exploratory rule discovery is one of the important research foci in data mining. Many researchers have presented algorithms to automatically remove potentially uninteresting rules utilizing background knowledge and user-specified constraints. Identifying the significance of exploratory rules using a significance test is desirable for removing rules that may appear interesting by chance, hence providing the users with a more compact set of resulting rules. However, applying statistical tests to identify significant rules requires considerable computation and data access in order to obtain the necessary statistics. The situation gets worse as the size of the database increases. In this paper, we propose two approaches for improving the efficiency of significant exploratory rule discovery. We also evaluate the experimental effect in impact rule discovery which is suitable for discovering exploratory rules in very large, dense databases.
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Agrawal, R., Imielinski, T., Swami, A.N.: Mining association rules between sets of items in large databases. In: Buneman, P., Jajodia, S. (eds.) Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, Washington, D.C, pp. 207–216 (May 26-28, 1993)
Aumann, Y., Lindell, Y.: A statistical theory for quantitative association rules. In: Knowledge Discovery and Data Mining, pp. 261–270 (1999)
Bay, S.D.: The uci kdd archive (1999), http://kdd.ics.uci.edu
Bay, S.D., Pazzani, M.J.: Detecting group differences: Mining contrast sets. In: Data Mining and Knowledge Discovery, pp. 213–246 (2001)
Bayardo Jr., R.J., Agrawal, R., Gunopulos, D.: Constraint-based rule mining in large, dense databases. Data Min. Knowl. Discov. 4(2-3), 217–240 (2000)
Blake, C.L., Merz, C.J.: UCI repository of machine learning databases (1998)
Han, J., Kamber, M.: Data mining: concepts and techniques. Morgan Kaufmann, San Francisco (2001)
Huang, S., Webb, G.I.: Discarding insignificant rules during impact rule discovery in large database. In: SIAM Data Mining Conference, Newport Beach, USA (2005)
Pei, J.H.J., Lakshmanan, L.V.S.: Mining frequent itemsets with convertible constraints. In: Proceedings of the 17th International Conference on Data Engineering, p. 433. IEEE Computer Society, Los Alamitos (2001)
Liu, B., Hsu, W., Ma, Y.: Pruning and summarizing the discovered associations. In: Knowledge Discovery and Data Mining, pp. 125–134 (1999)
Michalski, R.S.: A theory and methodology of inductive learning. In: Michalski, R.S., Carbonell, J.G., Mitchell, T.M. (eds.) Machine Learning: An Artificial Intelligence Approach, pp. 83–134. Springer, Heidelberg (1984)
Quinlan, J.R.: C4.5: programs for machine learning. Morgan Kaufmann Publishers Inc., San Francisco (1993)
Webb, G.I.: Discovering associations with numeric variables. In: Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 383–388. ACM Press, New York (2001)
Webb, G.I.: OPUS: An efficient admissible algorithm for unordered search. Journal of Artificial Intelligence Research 3, 431–465 (1995)
Webb, G.I.: Statistically sound exploratory rule discovery (2004) (to be published)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Huang, S., Webb, G.I. (2006). Efficiently Identifying Exploratory Rules’ Significance. In: Williams, G.J., Simoff, S.J. (eds) Data Mining. Lecture Notes in Computer Science(), vol 3755. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11677437_6
Download citation
DOI: https://doi.org/10.1007/11677437_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32547-5
Online ISBN: 978-3-540-32548-2
eBook Packages: Computer ScienceComputer Science (R0)