Efficient Approximate Mining of Frequent Patterns over Transactional Data Streams

Ng, Willie; Dash, Manoranjan

doi:10.1007/978-3-540-85836-2_23

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5182)

Abstract

We investigate the problem of finding frequent patterns in a continuous stream of transactions. It is recognized that approximate solutions are usually sufficient, and much of the existing literature explicitly trades accuracy for speed, with the quality of the final approximate counts governed by an error parameter, ε. However, choosing a value for ε is never simple. A small ε gives good accuracy but hurts efficiency. A larger ε improves efficiency but seriously degrades mining accuracy. To alleviate this problem, we offer an alternative that allows the user to customize a set of error bounds according to their requirements. Our experimental studies show that the proposed algorithm has high precision, requires less memory, and consumes less CPU time.
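
To make the role of ε concrete, the following is a minimal sketch of Lossy Counting (Manku and Motwani, VLDB 2002), a well-known ε-approximate counter, restricted here to single items rather than itemsets. It is not the algorithm proposed in this paper, which the abstract does not describe. The function names `lossy_count` and `frequent_items` and the `support` parameter are illustrative assumptions.

```python
import math


def lossy_count(stream, epsilon):
    """Single-item Lossy Counting sketch (illustrative, not the paper's method).

    Returns (counts, n), where counts maps item -> [f, delta]:
    f is the count observed since the item was (re)inserted and delta is
    the maximum number of occurrences that may have been missed before that.
    """
    width = math.ceil(1 / epsilon)  # bucket width: smaller epsilon, wider bucket
    counts = {}
    n = 0
    for item in stream:
        n += 1
        bucket = math.ceil(n / width)  # id of the current bucket
        if item in counts:
            counts[item][0] += 1
        else:
            counts[item] = [1, bucket - 1]
        # At each bucket boundary, drop entries that cannot be frequent.
        if n % width == 0:
            stale = [k for k, (f, d) in counts.items() if f + d <= bucket]
            for k in stale:
                del counts[k]
    return counts, n


def frequent_items(counts, n, support, epsilon):
    """Report items whose stored count is at least (support - epsilon) * n."""
    threshold = (support - epsilon) * n
    return {k: f for k, (f, _) in counts.items() if f >= threshold}
```

In this sketch every stored count undercounts the true frequency by at most εN after N stream elements. A smaller ε widens the buckets, so pruning happens less often and more entries stay in memory. A larger ε prunes more aggressively but loosens the bound on the reported counts, which is the trade-off the abstract describes.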

Author information

Authors and Affiliations

Centre for Advanced Information Systems, Nanyang Technological University, Singapore, 639798
Willie Ng & Manoranjan Dash

Editor information

Il-Yeol Song, Johann Eder, Tho Manh Nguyen

About this paper

Cite this paper

Ng, W., Dash, M. (2008). Efficient Approximate Mining of Frequent Patterns over Transactional Data Streams. In: Song, IY., Eder, J., Nguyen, T.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2008. Lecture Notes in Computer Science, vol 5182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85836-2_23

DOI: https://doi.org/10.1007/978-3-540-85836-2_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85835-5
Online ISBN: 978-3-540-85836-2
eBook Packages: Computer Science, Computer Science (R0)
