EStream: Online Mining of Frequent Sets with Precise Error Guarantee

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4081)

Abstract

In data stream applications, a good approximation delivered in a timely manner is often better than an exact answer that arrives after the window of opportunity has closed. The quality of the approximation, of course, matters as much as its timely delivery. Unfortunately, existing algorithms capable of online processing do not conform strictly to a precise error guarantee. Because stream applications need both online processing and a precise error bound, stream algorithms should meet both criteria, yet this is not the case for mining frequent sets in data streams. We present EStream, a novel algorithm that allows online processing while producing results strictly within the error bound. Our theoretical and experimental results show that EStream is a better candidate for finding frequent sets in data streams when both constraints must be satisfied.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dang, X.H., Ng, WK., Ong, KL. (2006). EStream: Online Mining of Frequent Sets with Precise Error Guarantee. In: Tjoa, A.M., Trujillo, J. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2006. Lecture Notes in Computer Science, vol 4081. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11823728_30

  • DOI: https://doi.org/10.1007/11823728_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-37736-8

  • Online ISBN: 978-3-540-37737-5

  • eBook Packages: Computer Science, Computer Science (R0)
