WSFI-Mine: Mining Frequent Patterns in Data Streams

Kim, Younghee; Kim, Ungmo

doi:10.1007/978-3-642-01510-6_95

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5552)

Included in the following conference series:

International Symposium on Neural Networks

Abstract

A data stream is a massive, unbounded sequence of data elements continuously generated at a rapid rate. Data mining over data streams should support a flexible trade-off between processing time and mining accuracy. It should do so without a fixed mining granularity, so that changes in the mining results are detected as soon as possible. The continuous nature of streaming data calls for algorithms that need only one scan over the stream for knowledge discovery. This paper focuses on research issues in mining frequent itemsets in data streams and presents an efficient algorithm, WSFI-mine (Weighted Support Frequent Itemsets), that mines all frequent itemsets in a single scan of the data stream. WSFI-mine's novel contribution is to mine frequent patterns efficiently by generating constrained candidate itemsets and by using an extended FP-tree-based compact pattern representation under a sliding window over the data stream. The method requires less memory and less execution time.
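
As a rough illustration of the one-scan, sliding-window setting described above, the following Python sketch keeps itemset counts for the most recent transactions and reports the itemsets whose weighted support meets a threshold. It is not WSFI-mine: it enumerates itemsets by brute force instead of using an extended FP-tree, and the abstract does not define weighted support, so the definition used here (mean item weight times relative frequency in the window) is an assumption. The class name, method names, and parameters are all hypothetical.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowMiner:
    """Brute-force itemset counts over the last `window_size` transactions."""

    def __init__(self, window_size, min_support, weights=None, max_len=3):
        self.window = deque()
        self.window_size = window_size
        self.min_support = min_support  # threshold on weighted support
        self.weights = weights or {}    # assumed per-item weights, default 1.0
        self.max_len = max_len          # cap on itemset size to bound enumeration
        self.counts = Counter()

    def _itemsets(self, transaction):
        # Enumerate every non-empty itemset of up to max_len distinct items.
        for k in range(1, min(self.max_len, len(transaction)) + 1):
            yield from combinations(transaction, k)

    def add(self, transaction):
        # Each transaction is read once; the oldest is evicted when the window is full.
        transaction = tuple(sorted(set(transaction)))
        if len(self.window) == self.window_size:
            for s in self._itemsets(self.window.popleft()):
                self.counts[s] -= 1
                if self.counts[s] == 0:
                    del self.counts[s]
        self.window.append(transaction)
        for s in self._itemsets(transaction):
            self.counts[s] += 1

    def weighted_support(self, itemset):
        # Assumed definition: mean item weight times relative frequency in the window.
        key = tuple(sorted(set(itemset)))
        if not key or not self.window:
            return 0.0
        mean_weight = sum(self.weights.get(i, 1.0) for i in key) / len(key)
        return mean_weight * self.counts[key] / len(self.window)

    def frequent(self):
        return {s for s in self.counts if self.weighted_support(s) >= self.min_support}


miner = SlidingWindowMiner(window_size=3, min_support=0.6)
for t in [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b", "c"]]:
    miner.add(t)
print(sorted(miner.frequent()))
```

Enumerating all subsets of each transaction is exponential in transaction length, which is why the sketch caps itemset size with `max_len`; a compact tree representation such as the one the abstract mentions is the kind of structure meant to avoid that cost.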

Author information

Authors and Affiliations

Department of Computer Engineering, Sungkyunkwan University, 300 Chunchun-dong, Suwon, Gyeonggi-Do, 440-746, Korea
Younghee Kim & Ungmo Kim

Editor information

Editors and Affiliations

Departamento de Control Automático, CINVESTAV-IPN, A.P. 14-740, Av. IPN 2508, D.F., 07360, México, México
Wen Yu
Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ 07030, USA
Haibo He
Dept. of Electrical and Computer Engineering, South Dakota School of Mines & Technology, 501 E. St. Joseph Street, Rapid City, SD 57701, USA
Nian Zhang

About this paper

Cite this paper

Kim, Y., Kim, U. (2009). WSFI-Mine: Mining Frequent Patterns in Data Streams. In: Yu, W., He, H., Zhang, N. (eds) Advances in Neural Networks – ISNN 2009. ISNN 2009. Lecture Notes in Computer Science, vol 5552. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01510-6_95

DOI: https://doi.org/10.1007/978-3-642-01510-6_95
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01509-0
Online ISBN: 978-3-642-01510-6
eBook Packages: Computer Science, Computer Science (R0)
