Abstract
Capturing recent trends in the data is an important issue when mining frequent itemsets from data streams. To avoid storing all the transaction data within the sliding window, the frequency changing point (FCP) method was proposed for monitoring the recent occurrences of itemsets in a data stream, under the assumption that exactly one transaction arrives at each time point. In this paper, the FCP method is extended to maintain recent patterns in a data stream where a block containing a varying number of transactions (zero or more) arrives within each time unit. Moreover, to avoid generating redundant information in the mining results, the recently representative patterns are discovered approximately from the maintained structure. The experimental results show that our approach significantly reduces run-time memory usage. In addition, the proposed GFCP algorithm achieves high accuracy in its mining results and guarantees that no false dismissals occur.
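To make the problem setting concrete, the following is a minimal sketch of a naive sliding-window baseline in which each time unit delivers a block of zero or more transactions. It is not the FCP or GFCP method: it keeps a per-block summary of itemset counts for every time unit in the window, which is the kind of storage cost the paper's approach aims to avoid. The class name `NaiveBlockWindow` and the parameters `window`, `max_len`, and `min_support` are hypothetical names chosen for illustration.

```python
from collections import Counter, deque
from itertools import combinations


class NaiveBlockWindow:
    """Naive baseline (not GFCP): keeps itemset counts for each of the
    last `window` time units and recomputes recent support from them."""

    def __init__(self, window, max_len=2):
        self.window = window      # number of time units in the sliding window
        self.max_len = max_len    # largest itemset size that is counted
        self.blocks = deque()     # one Counter of itemset counts per time unit
        self.sizes = deque()      # number of transactions per time unit
        self.counts = Counter()   # itemset counts over the current window
        self.total = 0            # number of transactions in the current window

    def _itemsets(self, transaction):
        # Enumerate all itemsets of size 1..max_len as sorted tuples.
        items = sorted(set(transaction))
        for k in range(1, self.max_len + 1):
            yield from combinations(items, k)

    def add_block(self, transactions):
        """Process the block of one time unit; the block may be empty."""
        block = Counter()
        for t in transactions:
            block.update(self._itemsets(t))
        self.blocks.append(block)
        self.sizes.append(len(transactions))
        self.counts.update(block)
        self.total += len(transactions)
        # Expire the oldest time unit once the window is exceeded.
        if len(self.blocks) > self.window:
            self.counts.subtract(self.blocks.popleft())
            self.total -= self.sizes.popleft()

    def frequent(self, min_support):
        """Return {itemset: support} for itemsets meeting min_support."""
        if self.total == 0:
            return {}
        return {
            s: c / self.total
            for s, c in self.counts.items()
            if c > 0 and c / self.total >= min_support
        }


# Usage: a window of two time units, with an empty block in the middle.
w = NaiveBlockWindow(window=2)
w.add_block([["a", "b"], ["a"]])
w.add_block([])
w.add_block([["b", "c"]])
print(w.frequent(0.5))
```

Because the window is measured in time units rather than in transactions, the number of transactions it covers changes as blocks arrive and expire, so support is computed against the current window total instead of a fixed size.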
This work was partially supported by the R.O.C. N.S.C. under Contract No. 95-2221-E-003-011 and 95-2524-S-003-012.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Koh, JL., Don, YB. (2007). Approximately Mining Recently Representative Patterns on Data Streams. In: Washio, T., et al. Emerging Technologies in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4819. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77018-3_25
DOI: https://doi.org/10.1007/978-3-540-77018-3_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77016-9
Online ISBN: 978-3-540-77018-3
eBook Packages: Computer Science, Computer Science (R0)