Efficient Approximate Mining of Frequent Patterns over Transactional Data Streams

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 5182))

Abstract

We investigate the problem of finding frequent patterns in a continuous stream of transactions. It is widely recognized that approximate solutions are usually sufficient, and much of the existing literature explicitly trades accuracy for speed, with the quality of the final approximate counts governed by an error parameter, ε. However, choosing ε is never simple. A small ε gives good accuracy at the cost of efficiency; a larger ε improves efficiency but seriously degrades mining accuracy. To alleviate this problem, we offer an alternative that allows users to customize a set of error bounds according to their requirements. Our experimental studies show that the proposed algorithm has high precision, requires less memory, and consumes less CPU time.
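To make the role of ε concrete, the following is a minimal sketch of the classic Lossy Counting scheme of Manku and Motwani, restricted to single items rather than itemsets. It is not the algorithm proposed in this paper; it only illustrates the kind of ε-governed approximate counting the abstract refers to. The function names `lossy_count` and `frequent_items` are hypothetical and chosen for illustration.

```python
def lossy_count(stream, epsilon):
    """Illustrative Lossy Counting over single items (an assumption-laden
    sketch, not the paper's method).

    Returns (counts, n), where counts maps item -> (f, delta):
    f is the counted frequency and delta bounds how much f may undercount.
    """
    width = int(-(-1 // epsilon))          # bucket width = ceil(1 / epsilon)
    counts = {}
    n = 0
    for item in stream:
        n += 1
        bucket = (n + width - 1) // width  # id of the current bucket
        if item in counts:
            f, delta = counts[item]
            counts[item] = (f + 1, delta)
        else:
            # A new entry may have been missed in up to (bucket - 1) earlier buckets.
            counts[item] = (1, bucket - 1)
        if n % width == 0:
            # At each bucket boundary, drop entries that cannot be frequent.
            counts = {k: (f, d) for k, (f, d) in counts.items() if f + d > bucket}
    return counts, n


def frequent_items(stream, support, epsilon):
    """Report items whose counted frequency is at least (support - epsilon) * n."""
    counts, n = lossy_count(stream, epsilon)
    threshold = (support - epsilon) * n
    return {k: f for k, (f, _) in counts.items() if f >= threshold}
```

A smaller `epsilon` widens each bucket, so fewer entries are pruned and the counts are tighter, but more entries stay in memory. A larger `epsilon` prunes more aggressively and loosens the guarantee that each counted frequency undercounts the true one by at most `epsilon * n`.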

Editor information

Il-Yeol Song, Johann Eder, Tho Manh Nguyen

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ng, W., Dash, M. (2008). Efficient Approximate Mining of Frequent Patterns over Transactional Data Streams. In: Song, IY., Eder, J., Nguyen, T.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2008. Lecture Notes in Computer Science, vol 5182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85836-2_23

  • DOI: https://doi.org/10.1007/978-3-540-85836-2_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85835-5

  • Online ISBN: 978-3-540-85836-2

  • eBook Packages: Computer Science, Computer Science (R0)
