Strategies for Partitioning Data in Association Rule Mining

  • Conference paper
Research and Development in Intelligent Systems XX (SGAI 2003)

Abstract

The problem of extracting association rules from databases is well known. The most demanding part of the problem is the determination of the support for all those sets of attributes which occur often enough to be of possible interest. We have previously described methods we have developed that approach the problem by first constructing a tree (the P-tree) that contains a record of all the relevant information in the database and a partial computation of the support totals. This approach offers significant performance advantages over comparable alternative methods, which we have demonstrated experimentally with store-resident datasets. In practice, however, the real focus of interest is on much larger databases. In this paper we discuss strategies for partitioning the data in these cases, and present results of the performance analysis.
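To make the support-counting step concrete, the following is a minimal sketch of counting itemset support over a horizontally partitioned dataset: each segment is processed on its own and the per-segment counts are summed. It is not the P-tree method the abstract refers to; it is a brute-force illustration, and the function names (`partition`, `count_itemsets`, `frequent_itemsets`) and the parameters `n_parts`, `min_support` and `max_size` are assumptions introduced here, not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def partition(records, n_parts):
    """Split records into at most n_parts contiguous segments (hypothetical helper)."""
    if not records:
        return []
    size = max(1, -(-len(records) // n_parts))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def count_itemsets(segment, max_size):
    """Count every itemset of up to max_size items occurring in one segment."""
    counts = Counter()
    for record in segment:
        items = sorted(set(record))  # canonical order, duplicates dropped
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return counts


def frequent_itemsets(records, n_parts, min_support, max_size=2):
    """Sum per-segment counts and keep itemsets meeting the support threshold."""
    totals = Counter()
    for segment in partition(records, n_parts):
        # Only one segment's records need to be in memory at a time.
        totals.update(count_itemsets(segment, max_size))
    threshold = min_support * len(records)
    return {s: c for s, c in totals.items() if c >= threshold}


records = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]]
result = frequent_itemsets(records, n_parts=2, min_support=0.5)
```

Because support counts are additive across disjoint segments, the result does not depend on how many segments are used; only the amount of data held in memory at once changes. Enumerating every subset of each record is exponential in record length, which is why the sketch caps itemset size with `max_size`.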

Copyright information

© 2004 Springer-Verlag London

About this paper

Cite this paper

Ahmed, S., Coenen, F., Leng, P. (2004). Strategies for Partitioning Data in Association Rule Mining. In: Coenen, F., Preece, A., Macintosh, A. (eds) Research and Development in Intelligent Systems XX. SGAI 2003. Springer, London. https://doi.org/10.1007/978-0-85729-412-8_10

  • DOI: https://doi.org/10.1007/978-0-85729-412-8_10

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-85233-780-3

  • Online ISBN: 978-0-85729-412-8

  • eBook Packages: Springer Book Archive
