Efficiently Identifying Exploratory Rules’ Significance

  • Chapter
Data Mining

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3755)

Abstract

Efficiently discarding potentially uninteresting rules in exploratory rule discovery is an important research focus in data mining. Many researchers have presented algorithms that automatically remove potentially uninteresting rules using background knowledge and user-specified constraints. Applying a significance test to exploratory rules is desirable because it removes rules that appear interesting only by chance, and so gives users a more compact set of resulting rules. However, applying statistical tests to identify significant rules requires considerable computation and data access to obtain the necessary statistics, and the cost grows with the size of the database. In this paper, we propose two approaches for improving the efficiency of significant exploratory rule discovery. We also evaluate their effect experimentally in impact rule discovery, which is suitable for discovering exploratory rules in very large, dense databases.



Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Huang, S., Webb, G.I. (2006). Efficiently Identifying Exploratory Rules’ Significance. In: Williams, G.J., Simoff, S.J. (eds) Data Mining. Lecture Notes in Computer Science, vol 3755. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11677437_6

  • DOI: https://doi.org/10.1007/11677437_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-32547-5

  • Online ISBN: 978-3-540-32548-2

  • eBook Packages: Computer Science (R0)
