Adaptive Load Shedding for Mining Frequent Patterns from Data Streams

  • Conference paper
Data Warehousing and Knowledge Discovery (DaWaK 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4081)

Abstract

Most algorithms for discovering frequent patterns from data streams assume that the system can process every incoming transaction without delay and without dropping any transactions. This assumption is often impractical because of the inherent characteristics of data stream environments. Under high load in particular, system resources are often insufficient to process the incoming transactions. The resulting latency reduces the applicability of the data mining models produced, which often have only a small window of opportunity. We propose a load shedding algorithm to address this issue. The algorithm adaptively detects overload situations and drops transactions from data streams using a probabilistic model. We tested the algorithm on both synthetic and real-life datasets to verify its feasibility.
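The abstract does not spell out the probabilistic model, so the following is only a minimal sketch of the general idea rather than the paper's algorithm. It assumes a uniform random drop policy: when the observed arrival rate exceeds a known processing capacity, each transaction is kept with probability capacity / arrival rate, and the counts of kept items are scaled up by the inverse of that probability so the support estimates stay unbiased. The names `LoadShedder`, `capacity`, and `mine_item_counts` are hypothetical, and counting single items stands in for a real frequent pattern miner.

```python
import random
from collections import Counter


class LoadShedder:
    """Hypothetical shedder: drops transactions at random under overload."""

    def __init__(self, capacity, seed=None):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # Assumption: the number of transactions the miner can process
        # per time unit is known in advance.
        self.capacity = capacity
        self.keep_prob = 1.0
        self._rng = random.Random(seed)

    def update(self, arrival_rate):
        """Detect overload and set the probability of keeping a transaction."""
        if arrival_rate <= self.capacity:
            self.keep_prob = 1.0  # no overload: keep everything
        else:
            self.keep_prob = self.capacity / arrival_rate
        return self.keep_prob

    def admit(self):
        """Return True if the next transaction should be processed."""
        return self._rng.random() < self.keep_prob


def mine_item_counts(batches, capacity, seed=0):
    """Estimate item supports over batches, one batch per time unit."""
    shedder = LoadShedder(capacity, seed)
    counts = Counter()
    kept = 0
    for batch in batches:
        # The batch size serves as the observed arrival rate for this time unit.
        p = shedder.update(len(batch))
        for transaction in batch:
            if shedder.admit():
                kept += 1
                for item in set(transaction):
                    # Scale by 1/p to compensate for dropped transactions.
                    counts[item] += 1.0 / p
    return counts, kept
```

For example, `mine_item_counts([[["a", "b"], ["a"]]], capacity=10)` sees no overload and counts every transaction exactly, while a batch of 1000 transactions with `capacity=100` keeps roughly one transaction in ten. Uniform dropping is the simplest policy to reason about; a scheme that weights transactions by their expected contribution to the mined patterns would need a different estimator.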

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dang, X.H., Ng, W.K., Ong, K.L. (2006). Adaptive Load Shedding for Mining Frequent Patterns from Data Streams. In: Tjoa, A.M., Trujillo, J. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2006. Lecture Notes in Computer Science, vol 4081. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11823728_33

  • DOI: https://doi.org/10.1007/11823728_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-37736-8

  • Online ISBN: 978-3-540-37737-5

  • eBook Packages: Computer Science, Computer Science (R0)
