Mining Approximate Frequent Itemsets over Data Streams Using Window Sliding Techniques

Kim, Younghee; Park, Eunkyoung; Kim, Ungmo

doi:10.1007/978-3-642-10583-8_7

Part of the book series: Communications in Computer and Information Science (CCIS, volume 64)

Included in the following conference series:

International Conference on Database Theory and Application

Abstract

Frequent itemset mining is a core data mining operation and has been studied extensively across a broad range of applications. Frequent itemset mining over data streams aims to find an approximate set of frequent itemsets in the transactions with respect to a given support threshold. In this paper, we consider the problem of approximate frequency counting for space-efficient computation over data stream sliding windows. Approximate frequent itemset mining algorithms use a user-specified error parameter, ε, to obtain an extra set of itemsets that have the potential to become frequent later. We therefore develop an algorithm based on the Chernoff bound for finding frequent itemsets over a data stream sliding window. We present an improved algorithm, MAFIM (maximal approximate frequent itemsets mining), which mines frequent itemsets by approximate counting using previously saved maximal frequent itemsets. The proposed algorithm guarantees the output quality and also bounds the memory usage.
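
The role of the error parameter ε can be illustrated with a minimal sketch. This is not the MAFIM algorithm, whose details are not given in the abstract: the code keeps exact counts over a small window rather than using saved maximal itemsets. The names `chernoff_epsilon` and `SlidingWindowCounter`, the `max_len` cap, and the constant 3 in the bound (taken from the textbook two-sided multiplicative Chernoff bound, valid while ε does not exceed the support s) are all assumptions made for illustration; the paper may use a different form.

```python
import math
from collections import Counter, deque
from itertools import combinations


def chernoff_epsilon(s, n, delta):
    """Margin eps such that P(|observed support - s| >= eps) <= delta.

    Assumes the textbook bound 2 * exp(-n * s * gamma**2 / 3) with
    eps = s * gamma (valid while eps <= s). The constant is an assumption.
    """
    return math.sqrt(3.0 * s * math.log(2.0 / delta) / n)


class SlidingWindowCounter:
    """Exact itemset counts over the most recent `window_size` transactions."""

    def __init__(self, window_size, max_len=2):
        self.window_size = window_size
        self.max_len = max_len      # hypothetical cap on itemset length
        self.window = deque()       # transactions currently in the window
        self.counts = Counter()     # itemset (sorted tuple) -> occurrences

    def _subsets(self, transaction):
        items = sorted(transaction)
        for k in range(1, self.max_len + 1):
            yield from combinations(items, k)

    def add(self, transaction):
        transaction = frozenset(transaction)
        self.window.append(transaction)
        for itemset in self._subsets(transaction):
            self.counts[itemset] += 1
        # Slide the window: forget the oldest transaction once it is full.
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            for itemset in self._subsets(expired):
                self.counts[itemset] -= 1
                if self.counts[itemset] == 0:
                    del self.counts[itemset]

    def query(self, s, eps):
        """Return (frequent, potential) itemsets for support s and error eps."""
        n = len(self.window)
        frequent, potential = set(), set()
        for itemset, count in self.counts.items():
            if count >= s * n:
                frequent.add(itemset)
            elif count >= (s - eps) * n:
                potential.add(itemset)  # below s, but may become frequent later
        return frequent, potential


if __name__ == "__main__":
    counter = SlidingWindowCounter(window_size=4)
    for t in [{"a", "b"}, {"a", "c"}, {"a", "b"}, {"b", "c"}, {"a", "b", "c"}]:
        counter.add(t)
    print(counter.query(s=0.75, eps=0.25))
    print(chernoff_epsilon(s=0.1, n=10_000, delta=0.1))
```

Itemsets whose support falls between s − ε and s are reported separately as the "extra" set that may become frequent later. In a real stream, ε would trade memory for accuracy: a larger ε tracks more borderline itemsets.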

This work was supported by the Korea Science and Engineering Foundation (KOSEF) grant funded by the Korea government (MEST) (No.2009-0075771).

Author information

Authors and Affiliations

School of Information and Communication Engineering, Sungkyunkwan University, 300 Chunchun-dong, Suwon, Gyeonggi-Do, 440-746, Korea
Younghee Kim, Eunkyoung Park & Ungmo Kim

Editor information

Editors and Affiliations

University of Warsaw and Infobright Inc., Poland
Dominik Ślęzak
Hannam University, 306-791, Daejeon, South Korea
Tai-hoon Kim
Utrecht University, The Netherlands
Yanchun Zhang
Hosei University, Tokyo, Japan
Jianhua Ma
ETRI, South Korea
Kyo-il Chung

About this paper

Cite this paper

Kim, Y., Park, E., Kim, U. (2009). Mining Approximate Frequent Itemsets over Data Streams Using Window Sliding Techniques. In: Ślęzak, D., Kim, Th., Zhang, Y., Ma, J., Chung, Ki. (eds) Database Theory and Application. DTA 2009. Communications in Computer and Information Science, vol 64. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10583-8_7

DOI: https://doi.org/10.1007/978-3-642-10583-8_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10582-1
Online ISBN: 978-3-642-10583-8
eBook Packages: Computer Science, Computer Science (R0)
