A Sampling-Based Method for Mining Frequent Patterns from Databases

  • Conference paper
Fuzzy Systems and Knowledge Discovery (FSKD 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3614)

Abstract

Mining frequent item sets (frequent patterns) in transaction databases is a well-known problem in data mining research. This work proposes a sampling-based method for finding frequent patterns. The method has three phases. In the first phase, we draw a small sample of the data to estimate the set of frequent patterns, denoted F_S. The second phase computes the actual supports of the patterns in F_S and identifies the subset of patterns in F_S that must be examined further in the next phase. Finally, the third phase explores this subset and finds all missing frequent patterns. The empirical results show that our algorithm is efficient, running about two to three times faster than the well-known FP-growth algorithm.
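The abstract gives only the outline of the three phases, so the following Python sketch is an illustration under stated assumptions rather than the authors' algorithm. Phase 1 mines a random sample with a plain level-wise (Apriori-style) search, Phase 2 counts the sampled patterns against the full database, and Phase 3 is a stand-in: it reruns the level-wise search on the full database while reusing the Phase 2 counts, which is one simple way to recover patterns the sample missed. The names `support_counts`, `levelwise`, and `sample_then_verify`, and the parameters `sample_size` and `seed`, are hypothetical.

```python
import random
from itertools import combinations


def support_counts(transactions, candidates):
    """Count how many transactions contain each candidate itemset."""
    return {c: sum(1 for t in transactions if c <= t) for c in candidates}


def levelwise(transactions, min_count, known=None):
    """Apriori-style level-wise search.

    `known` maps itemsets to supports already counted on `transactions`,
    so those itemsets are not counted again.
    """
    known = dict(known or {})
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]
    frequent = {}
    while level:
        todo = [c for c in level if c not in known]
        known.update(support_counts(transactions, todo))
        current = {c: known[c] for c in level if known[c] >= min_count}
        frequent.update(current)
        # Join frequent k-itemsets into (k+1)-candidates and keep only
        # those whose k-subsets are all frequent.
        next_level = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == len(a) + 1 and all(
                union - {x} in current for x in union
            ):
                next_level.add(union)
        level = list(next_level)
    return frequent


def sample_then_verify(transactions, min_support, sample_size, seed=0):
    """Three-phase sketch (assumed design, not the paper's exact method)."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    rng = random.Random(seed)
    sample = rng.sample(transactions, min(sample_size, n))

    # Phase 1: estimate the frequent patterns F_S from the sample.
    f_s = levelwise(sample, min_support * len(sample))

    # Phase 2: compute the actual supports of F_S on the full database.
    actual = support_counts(transactions, f_s)

    # Phase 3 (stand-in): full level-wise search that reuses the Phase 2
    # counts, so only patterns outside F_S need fresh counting.
    return levelwise(transactions, min_support * n, known=actual)
```

Because Phase 3 in this sketch falls back to an exhaustive level-wise search, the result is exact for any sample; the sample affects only how many patterns still need to be counted in the last phase. The speed-up reported in the abstract depends on the authors' own Phase 3, which this sketch does not reproduce.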

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, YL., Ho, CY. (2005). A Sampling-Based Method for Mining Frequent Patterns from Databases. In: Wang, L., Jin, Y. (eds) Fuzzy Systems and Knowledge Discovery. FSKD 2005. Lecture Notes in Computer Science, vol 3614. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11540007_65

  • DOI: https://doi.org/10.1007/11540007_65

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28331-7

  • Online ISBN: 978-3-540-31828-6

  • eBook Packages: Computer Science, Computer Science (R0)
