Mining Closed Itemsets in Data Stream Using Formal Concept Analysis

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6263)

Abstract

Mining frequent closed itemsets has been shown to be more efficient than mining all frequent itemsets for generating non-redundant association rules. The task is challenging in a data stream environment because the stream is unbounded and each transaction can be examined only once (the no-second-look constraint).

In this paper, we propose an algorithm, CLICI, for mining all recent closed itemsets in the landmark window model of an online data stream. The algorithm consists of an online component and an offline component. The online component processes the transactions arriving in the stream without candidate generation and updates the synopsis appropriately. The offline component is invoked on demand to mine all frequent closed itemsets. Users can explore and experiment by specifying the support threshold dynamically.
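The online/offline split can be illustrated with a minimal Python sketch. It is not the CLICI algorithm itself: it uses a generic intersection-based update, in which each arriving transaction is intersected with the closed itemsets already stored, and the class and method names (`ClosedItemsetSynopsis`, `add`, `frequent`) are hypothetical. It also keeps a flat dictionary rather than a lattice and assumes the landmark window covers the whole stream seen so far.

```python
from typing import Dict, FrozenSet, Iterable

Itemset = FrozenSet[str]


class ClosedItemsetSynopsis:
    """Illustrative synopsis: closed itemset -> support count (hypothetical names)."""

    def __init__(self) -> None:
        self.support: Dict[Itemset, int] = {}
        self.n_transactions = 0

    def add(self, transaction: Iterable[str]) -> None:
        """Online step: fold one transaction into the synopsis."""
        t = frozenset(transaction)
        if not t:
            return
        self.n_transactions += 1
        # For every itemset of the form (closed itemset ∩ t), record its
        # support before t arrived: the largest support among the stored
        # closed itemsets that produce this intersection.
        base: Dict[Itemset, int] = {}
        for closed, count in self.support.items():
            common = closed & t
            if common:
                base[common] = max(base.get(common, 0), count)
        # The transaction itself is closed; if no stored itemset contains
        # it, its previous support is zero.
        base.setdefault(t, 0)
        # Each of these itemsets is contained in t, so its support grows by one.
        for itemset, count in base.items():
            self.support[itemset] = count + 1

    def frequent(self, min_support: float) -> Dict[Itemset, int]:
        """Offline step: closed itemsets meeting a relative support threshold."""
        threshold = min_support * self.n_transactions
        return {c: s for c, s in self.support.items() if s >= threshold}


synopsis = ClosedItemsetSynopsis()
for tx in (["a", "b", "c"], ["a", "b"], ["a", "c"]):
    synopsis.add(tx)

# The threshold is chosen at query time, not when the stream is processed.
print(synopsis.frequent(0.5))
```

Because the synopsis keeps every closed itemset rather than only the frequent ones, the threshold passed to `frequent` can change between queries without reprocessing the stream.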

The synopsis, CILattice, stores all recent closed itemsets in the stream. It is based on the concept lattice, a core structure of Formal Concept Analysis (FCA). Storing closed itemsets in the form of a lattice facilitates the generation of non-redundant association rules, which is the main motivation behind using a lattice-based synopsis.
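To see why a lattice of closed itemsets lends itself to rule generation, consider the following sketch. It does not reproduce the CILattice structure: it assumes the closed itemsets and their supports are already available in a dictionary, computes the covering (parent to child) edges by brute force, and reads one candidate rule off each edge. The function names and the example data are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Itemset = FrozenSet[str]


def lattice_edges(closed: Dict[Itemset, int]) -> List[Tuple[Itemset, Itemset]]:
    """Return pairs (lower, upper) where upper is an immediate superset of lower."""
    edges = []
    for lower in closed:
        uppers = [c for c in closed if lower < c]  # strict supersets
        for upper in uppers:
            # Keep the edge only if no closed itemset lies strictly between.
            if not any(lower < mid < upper for mid in uppers):
                edges.append((lower, upper))
    return edges


def rules_from_edges(
    closed: Dict[Itemset, int],
    edges: List[Tuple[Itemset, Itemset]],
    min_confidence: float,
) -> List[Tuple[Itemset, Itemset, float]]:
    """One rule per edge: lower -> (upper minus lower), with its confidence."""
    rules = []
    for lower, upper in edges:
        confidence = closed[upper] / closed[lower]
        if confidence >= min_confidence:
            rules.append((lower, upper - lower, confidence))
    return rules


# Hypothetical closed itemsets with support counts.
closed_sets = {
    frozenset("a"): 3,
    frozenset("ab"): 2,
    frozenset("ac"): 2,
    frozenset("abc"): 1,
}
edges = lattice_edges(closed_sets)
rules = rules_from_edges(closed_sets, edges, min_confidence=0.6)
for antecedent, consequent, confidence in rules:
    print(sorted(antecedent), "->", sorted(consequent), round(confidence, 2))
```

Only adjacent closed itemsets are paired, so a rule such as a -> bc, which follows from the chain a -> ab -> abc, is not emitted separately. This is one common way to avoid redundant rules; the paper may use a different rule basis.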

Experimental evaluation using synthetic and real-life datasets demonstrates the scalability of the algorithm.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gupta, A., Bhatnagar, V., Kumar, N. (2010). Mining Closed Itemsets in Data Stream Using Formal Concept Analysis. In: Bach Pedersen, T., Mohania, M.K., Tjoa, A.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2010. Lecture Notes in Computer Science, vol 6263. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15105-7_23

  • DOI: https://doi.org/10.1007/978-3-642-15105-7_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15104-0

  • Online ISBN: 978-3-642-15105-7

  • eBook Packages: Computer Science, Computer Science (R0)
