Abstract
We investigate the problem of finding frequent patterns in a continuous stream of transactions. Approximate solutions are generally recognized as sufficient, and many existing approaches explicitly trade accuracy for speed, with the quality of the final approximate counts governed by an error parameter, ε. However, choosing ε is never simple. A small ε gives good accuracy at the cost of efficiency; a larger ε improves efficiency but seriously degrades mining accuracy. To alleviate this problem, we offer an alternative that allows users to customize a set of error bounds according to their requirements. Our experimental studies show that the proposed algorithm has high precision, requires less memory, and consumes less CPU time.
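To make the role of ε concrete, the following is a minimal sketch of a single-ε baseline in the style of Lossy Counting (Manku and Motwani), restricted to single items rather than itemsets. It is not the algorithm proposed in this paper, whose details are not given in the abstract. The function names `lossy_count` and `frequent_items` and their parameters are assumptions introduced here for illustration only.

```python
import math


def lossy_count(stream, epsilon):
    """Approximate item counts with a single error parameter epsilon.

    Illustrative single-item Lossy Counting sketch, not the paper's method.
    Returns (entries, n), where entries maps item -> (count, delta) and
    n is the number of stream elements seen.
    """
    width = math.ceil(1 / epsilon)  # bucket width
    entries = {}
    n = 0
    for item in stream:
        n += 1
        bucket = math.ceil(n / width)  # id of the current bucket
        if item in entries:
            count, delta = entries[item]
            entries[item] = (count + 1, delta)
        else:
            # delta bounds how many occurrences may have been missed
            entries[item] = (1, bucket - 1)
        if n % width == 0:
            # At each bucket boundary, drop entries that cannot be frequent.
            entries = {
                k: (c, d) for k, (c, d) in entries.items() if c + d > bucket
            }
    return entries, n


def frequent_items(entries, n, support, epsilon):
    """Items whose estimated count reaches (support - epsilon) * n."""
    threshold = (support - epsilon) * n
    return {k for k, (c, _) in entries.items() if c >= threshold}
```

In this sketch every estimated count undercounts the true count by at most εN, where N is the stream length. A smaller ε tightens that bound but uses wider buckets, so entries are pruned less often and more of them are typically held in memory; a larger ε does the opposite. This is the tension the abstract describes for a single global ε.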
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ng, W., Dash, M. (2008). Efficient Approximate Mining of Frequent Patterns over Transactional Data Streams. In: Song, IY., Eder, J., Nguyen, T.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2008. Lecture Notes in Computer Science, vol 5182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85836-2_23
DOI: https://doi.org/10.1007/978-3-540-85836-2_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85835-5
Online ISBN: 978-3-540-85836-2
eBook Packages: Computer Science (R0)