Strategies for Partitioning Data in Association Rule Mining

Ahmed, Shakil; Coenen, Frans; Leng, Paul

doi:10.1007/978-0-85729-412-8_10

Included in the following conference series:

International Conference on Innovative Techniques and Applications of Artificial Intelligence

Abstract

The problem of extracting association rules from databases is well known. The most demanding part of the problem is the determination of the support for all those sets of attributes which occur often enough to be of possible interest. We have previously described methods we have developed that approach the problem by first constructing a tree (the P-tree) that contains a record of all the relevant information in the database and a partial computation of the support totals. This approach offers significant performance advantages over comparable alternative methods, which we have demonstrated experimentally with store-resident datasets. In practice, however, the real focus of interest is on much larger databases. In this paper we discuss strategies for partitioning the data in these cases, and present results of the performance analysis.
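As a rough illustration of why partitioning is feasible for support counting, the sketch below splits a set of records into horizontal segments, counts itemset occurrences in each segment independently, and sums the counts. It is not the P-tree method described in the abstract, and it does not reproduce any of the partitioning strategies evaluated in the paper; the function names (`partition`, `count_support`, `frequent_itemsets`), the `max_size` limit, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def partition(records, n_parts):
    """Split records into at most n_parts contiguous segments (horizontal partitioning)."""
    size = max(1, -(-len(records) // n_parts))  # ceiling division, at least 1
    return [records[i:i + size] for i in range(0, len(records), size)]


def count_support(records, max_size):
    """Count, for each itemset of up to max_size items, the records containing it."""
    counts = Counter()
    for record in records:
        items = sorted(set(record))  # canonical order so each itemset has one key
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return counts


def frequent_itemsets(records, n_parts, min_support, max_size=2):
    """Sum per-segment counts, then keep itemsets meeting the support threshold."""
    total = Counter()
    for segment in partition(records, n_parts):
        # Each segment is processed on its own, so only one segment
        # needs to be held in memory at a time.
        total.update(count_support(segment, max_size))
    threshold = min_support * len(records)
    return {s: c for s, c in total.items() if c >= threshold}


# Hypothetical toy data: four records over the items a, b, c.
records = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]]
result = frequent_itemsets(records, n_parts=2, min_support=0.5)
```

Because support is a simple count over records, the totals are the same whether the data is processed in one pass or segment by segment. Enumerating every subset of every record is exponential in record length, which is the cost that structures such as the P-tree are designed to reduce.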

Author information

Authors and Affiliations

Department of Computer Science, University of Liverpool, Liverpool, L69 7ZF, UK
Shakil Ahmed, Frans Coenen & Paul Leng

Editor information

Editors and Affiliations

Department of Computer Science, University of Liverpool, Liverpool, UK
Frans Coenen
Dept of Computer Science, University of Aberdeen, Aberdeen, UK
Alun Preece
Napier University, International Teledemocracy Centre, Edinburgh, EH10 5DT, UK
Ann Macintosh BSc, CEng

About this paper

Cite this paper

Ahmed, S., Coenen, F., Leng, P. (2004). Strategies for Partitioning Data in Association Rule Mining. In: Coenen, F., Preece, A., Macintosh, A. (eds) Research and Development in Intelligent Systems XX. SGAI 2003. Springer, London. https://doi.org/10.1007/978-0-85729-412-8_10

DOI: https://doi.org/10.1007/978-0-85729-412-8_10
Publisher Name: Springer, London
Print ISBN: 978-1-85233-780-3
Online ISBN: 978-0-85729-412-8
eBook Packages: Springer Book Archive
