Enhancing the Apriori Algorithm for Frequent Set Counting

  • Conference paper
Data Warehousing and Knowledge Discovery (DaWaK 2001)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2114)

Abstract

In this paper we propose DCP, a new algorithm for solving the Frequent Set Counting problem, which enhances Apriori. Our goal was to optimize the initial iterations of Apriori, which are the most time-consuming ones when the dataset is characterized by short or medium-length frequent patterns. The main improvements are an innovative method for storing candidate sets of items and counting their support, and effective pruning techniques that significantly reduce the size of the dataset as execution progresses.
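
As a rough illustration of the setting the abstract describes, the following is a minimal sketch of level-wise Apriori counting combined with a simple form of dataset pruning. It is not the DCP algorithm: the abstract does not specify DCP's candidate storage or pruning rules, so the function name `apriori_with_pruning`, its parameters, and the particular pruning used here (dropping items that occur in no frequent (k-1)-itemset, and dropping transactions too short to contain a k-itemset) are assumptions chosen only to show how the dataset can shrink as iterations progress.

```python
from itertools import combinations


def apriori_with_pruning(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets.

    Illustrative baseline only; NOT the paper's DCP algorithm.
    """
    # Iteration 1: count the support of every single item.
    counts = {}
    for t in transactions:
        for item in set(t):
            counts[item] = counts.get(item, 0) + 1
    frequent = {frozenset([i]): c for i, c in counts.items() if c >= min_support}
    result = dict(frequent)

    dataset = [set(t) for t in transactions]
    k = 2
    while frequent:
        # Dataset pruning (assumed, simple variant): keep only items that
        # occur in some frequent (k-1)-itemset, then drop transactions that
        # are too short to contain any k-itemset.
        live_items = set().union(*frequent)
        dataset = [t & live_items for t in dataset]
        dataset = [t for t in dataset if len(t) >= k]

        # Candidate generation: join frequent (k-1)-itemsets and keep a
        # candidate only if all of its (k-1)-subsets are frequent.
        prev = set(frequent)
        candidates = set()
        for a in prev:
            for b in prev:
                union = a | b
                if len(union) == k and all(
                    frozenset(s) in prev for s in combinations(union, k - 1)
                ):
                    candidates.add(union)

        # Support counting over the pruned dataset (naive subset test).
        counts = {c: 0 for c in candidates}
        for t in dataset:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
    return result


if __name__ == "__main__":
    demo = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c", "d"]]
    for itemset, support in sorted(
        apriori_with_pruning(demo, 3).items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))
    ):
        print(sorted(itemset), support)
```

The naive subset test in the counting loop is the step that dominates the early iterations on datasets with short or medium-length patterns, which is the cost the abstract says DCP targets.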

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Perego, R., Orlando, S., Palmerini, P. (2001). Enhancing the Apriori Algorithm for Frequent Set Counting. In: Kambayashi, Y., Winiwarter, W., Arikawa, M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2001. Lecture Notes in Computer Science, vol 2114. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44801-2_8

  • DOI: https://doi.org/10.1007/3-540-44801-2_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42553-3

  • Online ISBN: 978-3-540-44801-3

  • eBook Packages: Springer Book Archive
