Approximately Mining Recently Representative Patterns on Data Streams

Koh, Jia-Ling; Don, Yuan-Bin

doi:10.1007/978-3-540-77018-3_25


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4819)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining


Abstract

Capturing recent trends in the data is an important issue when mining frequent itemsets from data streams. To avoid storing all the transaction data within the sliding window, the frequency changing point (FCP) method was proposed for monitoring the recent occurrences of itemsets in a data stream under the assumption that exactly one transaction arrives at each time point. In this paper, the FCP method is extended to maintain recent patterns in a data stream where a block containing a varying number of transactions (zero or more) arrives within each time unit. Moreover, to avoid generating redundant information in the mining results, the recently representative patterns are discovered approximately from the maintained structure. The experimental results show that our approach significantly reduces run-time memory usage. In addition, the proposed GFCP algorithm achieves high accuracy in its mining results and guarantees that no false dismissals occur.
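To make the block-based stream setting concrete, the following is a minimal sketch of a naive baseline, not the FCP or GFCP method itself (the abstract does not describe their internal structures). It tracks the support of a single itemset over the most recent time units while keeping only one pair of counters per time unit instead of the transactions themselves. The class name `WindowedSupport`, its methods, and the sample stream are hypothetical and chosen purely for illustration.

```python
from collections import deque
from typing import Deque, FrozenSet, Iterable, List, Tuple


class WindowedSupport:
    """Hypothetical baseline: support of one itemset over the last
    `window` time units, storing (hits, block_size) per time unit
    rather than the raw transactions."""

    def __init__(self, itemset: Iterable[str], window: int) -> None:
        if window <= 0:
            raise ValueError("window must be positive")
        self.itemset: FrozenSet[str] = frozenset(itemset)
        self.window = window
        self.blocks: Deque[Tuple[int, int]] = deque()
        self.hits = 0    # transactions in the window containing the itemset
        self.total = 0   # all transactions in the window

    def add_block(self, transactions: List[Iterable[str]]) -> None:
        """Consume the block for one time unit; the block may be empty."""
        hits = sum(1 for t in transactions if self.itemset <= set(t))
        size = len(transactions)
        self.blocks.append((hits, size))
        self.hits += hits
        self.total += size
        # Slide the window: drop the counters of the expired time unit.
        if len(self.blocks) > self.window:
            old_hits, old_size = self.blocks.popleft()
            self.hits -= old_hits
            self.total -= old_size

    def support(self) -> float:
        """Fraction of windowed transactions that contain the itemset."""
        return self.hits / self.total if self.total else 0.0


# Illustrative stream: one block per time unit, with varying sizes.
tracker = WindowedSupport({"a", "b"}, window=3)
stream = [
    [["a", "b"], ["a", "c"]],  # t=1: two transactions
    [],                        # t=2: no transactions arrive
    [["a", "b", "c"]],         # t=3: one transaction
    [["c"], ["b"]],            # t=4: two transactions; t=1 expires
]
for block in stream:
    tracker.add_block(block)
```

This baseline still needs one counter pair per time unit for every monitored itemset, which is the kind of per-window bookkeeping a more compact summary would aim to reduce.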

This work was partially supported by the R.O.C. N.S.C. under Contract No. 95-2221-E-003-011 and 95-2524-S-003-012.



Author information

Authors and Affiliations

Department of Information Science and Computer Engineering, National Taiwan Normal University, Taipei, Taiwan
Jia-Ling Koh & Yuan-Bin Don


Editor information

Takashi Washio, Zhi-Hua Zhou, Joshua Zhexue Huang, Xiaohua Hu, Jinyan Li, Chao Xie, Jieyue He, Deqing Zou, Kuan-Ching Li, Mário M. Freire


About this paper

Cite this paper

Koh, JL., Don, YB. (2007). Approximately Mining Recently Representative Patterns on Data Streams. In: Washio, T., et al. Emerging Technologies in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4819. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77018-3_25


DOI: https://doi.org/10.1007/978-3-540-77018-3_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77016-9
Online ISBN: 978-3-540-77018-3
eBook Packages: Computer Science, Computer Science (R0)
