Abstract
Frequent itemset mining is a core data mining operation and has been extensively studied in a broad range of applications. Frequent itemset mining over a data stream aims to find an approximate set of frequent itemsets in the transactions with respect to a given support threshold. In this paper, we consider the problem of approximate frequency counting for space-efficient computation over data stream sliding windows. Approximate frequent itemset mining algorithms use a user-specified error parameter, ε, to maintain an extra set of itemsets that have the potential to become frequent later. We therefore developed an algorithm based on the Chernoff bound for finding frequent itemsets over a data stream sliding window. We present an improved algorithm, MAFIM (maximal approximate frequent itemsets mining), which mines frequent itemsets by approximate counting using previously saved maximal frequent itemsets. The proposed algorithm guarantees the output quality and bounds the memory usage.
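To illustrate the role of the error parameter ε described above, the following is a minimal sketch and not the MAFIM algorithm itself. It counts itemsets exactly over a small sliding window and separates those that meet the support threshold from those that fall within ε of it. The function name `classify_itemsets`, the parameter `max_size`, and the threshold convention (frequent at support s, potentially frequent between s − ε and s) are assumptions made for this example.

```python
from collections import Counter, deque
from itertools import combinations


def classify_itemsets(window, min_support, epsilon, max_size=2):
    """Split itemsets seen in `window` into frequent and potentially frequent.

    Hypothetical helper: counts are exact here, whereas a streaming
    algorithm would maintain approximate counts in bounded memory.
    """
    n = len(window)
    counts = Counter()
    for transaction in window:
        items = sorted(set(transaction))
        # Enumerate every itemset of size 1..max_size in this transaction.
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))

    # Assumed convention: frequent if count >= s * n,
    # potentially frequent if (s - epsilon) * n <= count < s * n.
    frequent = {i for i, c in counts.items() if c >= min_support * n}
    potential = {
        i for i, c in counts.items()
        if (min_support - epsilon) * n <= c < min_support * n
    }
    return frequent, potential


# A sliding window holding the 4 most recent transactions.
window = deque(maxlen=4)
for t in [["a", "b"], ["a", "c"], ["a", "b"], ["b", "c"], ["a", "b", "c"]]:
    window.append(t)  # the oldest transaction is dropped once the window is full

frequent, potential = classify_itemsets(window, min_support=0.75, epsilon=0.25)
```

With these inputs the single items meet the threshold, while the pairs land in the potentially frequent set. Keeping that second set is what lets an approximate miner notice itemsets that become frequent as the window slides.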
This work was supported by the Korea Science and Engineering Foundation (KOSEF) grant funded by the Korea government (MEST) (No.2009-0075771).
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, Y., Park, E., Kim, U. (2009). Mining Approximate Frequent Itemsets over Data Streams Using Window Sliding Techniques. In: Ślęzak, D., Kim, Th., Zhang, Y., Ma, J., Chung, Ki. (eds) Database Theory and Application. DTA 2009. Communications in Computer and Information Science, vol 64. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10583-8_7
DOI: https://doi.org/10.1007/978-3-642-10583-8_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10582-1
Online ISBN: 978-3-642-10583-8
eBook Packages: Computer Science (R0)