Skip to main content

Maintaining Frequent Itemsets over High-Speed Data Streams

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2006)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 3918))

Included in the following conference series:

Abstract

We propose a false-negative approach to approximate the set of frequent itemsets (FIs) over a sliding window. Existing approximate algorithms use an error parameter, ε, to control the accuracy of the mining result. However, the use of ε leads to a dilemma. A smaller ε gives a more accurate mining result but higher computational complexity, while increasing ε degrades the mining accuracy. We address this dilemma by introducing a progressively increasing minimum support function. When an itemset is retained in the window longer, we require its minimum support to approach the minimum support of an FI. Thus, the number of potential FIs to be maintained is greatly reduced. Our experiments show that our algorithm not only attains highly accurate mining results, but also runs significantly faster and consumes less memory than do existing algorithms for mining FIs over a sliding window.

This work is partially supported by RGC CERG under grant number HKUST6185/02E and HKUST6185/03E.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Chang, J.H., Lee, W.S.: estWin: Adaptively Monitoring the Recent Change of Frequent Itemsets over Online Data Streams. In: Proc. of CIKM (2003)

    Google Scholar 

  2. Chang, J.H., Lee, W.S.: A Sliding Window method for Finding Recently Frequent Itemsets over Online Data Streams. Journal of Information Science and Engineering 20(4) (July 2004)

    Google Scholar 

  3. Cheng, J., Ke, Y., Ng, W.: Maintaining Frequent Itemsets over High-Speed Data Streams. Technical Report, http://www.cs.ust.hk/~csjames/pakdd06tr.pdf

  4. Li, H., Lee, S., Shan, M.: An Efficient Algorithm for Mining Frequent Itemsets over the Entire History of Data Streams. In: Proc. of First International Workshop on Knowledge Discovery in Data Streams (2004)

    Google Scholar 

  5. Manku, G.S., Motwani, R.: Approximate Frequency Counts over Data Streams. In: Proc. of VLDB (2002)

    Google Scholar 

  6. Yu, J., Chong, Z., Lu, H., Zhou, A.: False Positive or False Negative: Mining Frequent Itemsets from High Speed Transactional Data Streams. In: VLDB (2004)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cheng, J., Ke, Y., Ng, W. (2006). Maintaining Frequent Itemsets over High-Speed Data Streams. In: Ng, WK., Kitsuregawa, M., Li, J., Chang, K. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2006. Lecture Notes in Computer Science(), vol 3918. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11731139_53

Download citation

  • DOI: https://doi.org/10.1007/11731139_53

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33206-0

  • Online ISBN: 978-3-540-33207-7

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics