Abstract
In this paper, we consider the problem of finding frequent elements over a data stream, which seeks the set of items whose frequency exceeds σN for a given threshold parameter σ. We refer to this model as the sliding window model. A user-specified error parameter, ε, controls the accuracy of the mining result. We propose FIA (Frequent Itemsets mining based on Approximate counting), an algorithm based on the Chernoff bound that guarantees the output quality and bounds the memory usage. We show that the proposed algorithm runs significantly faster and consumes less memory than existing algorithms for mining approximate frequent itemsets.
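To make the problem statement concrete, the following is a minimal sketch of an exact baseline, not the FIA algorithm itself. It assumes that N is the number of elements in the current window and that ε relaxes the threshold to (σ − ε)N, a common convention in approximate frequency counting; the abstract does not spell out either detail. The class and method names are hypothetical.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Exact item counts over the most recent `window_size` stream elements.

    Illustrative baseline only: it stores the whole window, which is the
    memory cost that approximate algorithms aim to avoid.
    """

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()
        self.counts = Counter()

    def add(self, item):
        """Append one stream element and evict the oldest if the window is full."""
        self.window.append(item)
        self.counts[item] += 1
        if len(self.window) > self.window_size:
            old = self.window.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def frequent(self, sigma):
        """Items whose count exceeds sigma * N, with N the current window length."""
        n = len(self.window)
        return {x for x, c in self.counts.items() if c > sigma * n}

    def acceptable(self, sigma, epsilon):
        """Items an epsilon-approximate answer may report (assumed convention):
        those whose count is at least (sigma - epsilon) * N."""
        n = len(self.window)
        return {x for x, c in self.counts.items() if c >= (sigma - epsilon) * n}


# Usage: three old elements are evicted, leaving a window with a=5, b=3, c=2.
counter = SlidingWindowCounter(window_size=10)
for item in ["x", "x", "x", "a", "b", "a", "c", "a", "b", "a", "c", "b", "a"]:
    counter.add(item)

print(sorted(counter.frequent(0.4)))          # ['a']
print(sorted(counter.acceptable(0.4, 0.15)))  # ['a', 'b']
```

Under these assumptions, an approximate algorithm must report every item in `frequent(sigma)` and may report only items in `acceptable(sigma, epsilon)`, so ε trades accuracy for the ability to avoid storing the full window.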
This work was supported by a Korea Science and Engineering Foundation (KOSEF) grant funded by the Korea government (MEST) (No. 2009-0075771).
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, Y., Ryu, J., Kim, U. (2009). FIA: Frequent Itemsets Mining Based on Approximate Counting in Data Streams. In: Leung, C.S., Lee, M., Chan, J.H. (eds) Neural Information Processing. ICONIP 2009. Lecture Notes in Computer Science, vol 5863. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10677-4_35
DOI: https://doi.org/10.1007/978-3-642-10677-4_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10676-7
Online ISBN: 978-3-642-10677-4
eBook Packages: Computer Science (R0)