Adaptive Load Shedding for Mining Frequent Patterns from Data Streams

Dang, Xuan Hong; Ng, Wee-Keong; Ong, Kok-Leong

doi:10.1007/11823728_33

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4081)

Included in the following conference series:

International Conference on Data Warehousing and Knowledge Discovery

Abstract

Most algorithms for discovering frequent patterns from data streams assume that the system can process every incoming transaction without delay and without dropping any. However, this assumption is often impractical given the inherent characteristics of data stream environments. Under high load in particular, system resources are often insufficient to process the incoming transactions. The resulting latencies in turn limit the applicability of the data mining models produced, which often have only a small window of opportunity. We propose a load shedding algorithm to address this issue. The algorithm adaptively detects overload situations and drops transactions from the data stream using a probabilistic model. We tested the algorithm on both synthetic and real-life datasets to verify its feasibility.
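
The abstract does not specify the probabilistic model, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes overload is detected by comparing the number of transactions arriving in a window against a fixed processing capacity, that excess transactions are dropped by uniform Bernoulli sampling, and that counts are rescaled by the keep probability. The names `keep_probability`, `shed_and_count`, `capacity`, and `max_len` are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def keep_probability(arrival_rate, capacity):
    """Return the fraction of transactions that can be processed.

    Assumption: both arguments are transactions per window. No shedding
    happens while the arrival rate stays within capacity.
    """
    if arrival_rate <= capacity:
        return 1.0
    return capacity / arrival_rate


def shed_and_count(window, capacity, rng, max_len=2):
    """Sample one window of transactions and estimate itemset counts.

    Each transaction is kept independently with probability p. Counts
    from the kept transactions are divided by p so that they estimate
    the counts over the full window.
    """
    p = keep_probability(len(window), capacity)
    counts = Counter()
    for transaction in window:
        if rng.random() >= p:
            continue  # transaction is shed
        items = sorted(set(transaction))
        for k in range(1, max_len + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    estimates = {itemset: c / p for itemset, c in counts.items()}
    return p, estimates


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical overloaded window: 1000 transactions, capacity 250.
    window = [["a", "b"] if i % 2 == 0 else ["a", "c"] for i in range(1000)]
    p, estimates = shed_and_count(window, capacity=250, rng=rng)
    print("keep probability:", p)
    print("estimated count of ('a',):", estimates.get(("a",)))
```

Uniform sampling is the simplest choice here. A scheme that weights transactions by their expected contribution to frequent patterns would need more information than the abstract provides.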

Author information

Authors and Affiliations

School of Computer Engineering, Nanyang Technological University, Singapore
Xuan Hong Dang & Wee-Keong Ng
School of Engineering & IT, Deakin University, Australia
Kok-Leong Ong

Editor information

Editors and Affiliations

Institute of Software Technology and Interactive Systems, Vienna University of Technology, Favoritenstr. 9-11/188, A-1040, Wien, Austria
A Min Tjoa
Department of Software and Computing Systems, University of Alicante, Spain
Juan Trujillo

About this paper

Cite this paper

Dang, X.H., Ng, WK., Ong, KL. (2006). Adaptive Load Shedding for Mining Frequent Patterns from Data Streams. In: Tjoa, A.M., Trujillo, J. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2006. Lecture Notes in Computer Science, vol 4081. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11823728_33

DOI: https://doi.org/10.1007/11823728_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37736-8
Online ISBN: 978-3-540-37737-5
eBook Packages: Computer Science, Computer Science (R0)