Mining frequent itemsets using the N-list and subsume concepts

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Frequent itemset mining is fundamental to many data mining problems that aim to find interesting patterns in data. The recently presented PrePost algorithm mines frequent itemsets using N-lists and in most cases outperforms other state-of-the-art algorithms. This paper proposes an improved version of PrePost, the N-list and Subsume-based algorithm for mining Frequent Itemsets (NSFI), which uses a hash table to speed up the creation of the N-lists associated with 1-itemsets, together with an improved N-list intersection algorithm. Furthermore, two new theorems are proposed for determining the “subsume index” of frequent 1-itemsets based on the N-list concept. Using the subsume index, NSFI can identify groups of frequent itemsets without determining the N-lists associated with them. The experimental results show that NSFI outperforms PrePost in terms of runtime and memory usage and outperforms dEclat in terms of runtime.
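
The role of the subsume index can be illustrated with a small sketch. It assumes the transaction-set formulation of the subsume concept: an item B belongs to the subsume index of an item A when every transaction containing A also contains B, so A combined with any subset of its subsumed items has the same support as A. The paper derives the index from N-lists; the sketch below uses plain transaction-id sets instead, and the function names and toy database are hypothetical. It also ignores the item ordering that a full miner would need to avoid producing the same itemset from two items with identical transaction sets.

```python
from itertools import combinations


def build_tidsets(transactions):
    """Map each item to the set of transaction ids that contain it."""
    tidsets = {}
    for tid, items in transactions.items():
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def subsume_index(tidsets):
    """For each item A, list the other items B with tidset(A) a subset of tidset(B).

    Assumption: this tidset-based definition stands in for the
    N-list-based theorems of the paper, which are not reproduced here.
    """
    return {
        a: sorted(b for b in tidsets if b != a and tidsets[a] <= tidsets[b])
        for a in tidsets
    }


def expand_with_subsume(item, tidsets, index):
    """Return {itemset: support} for `item` joined with every non-empty
    subset of its subsumed items. Each itemset inherits the support of
    `item`, so no intersection is computed."""
    support = len(tidsets[item])
    subsumed = index[item]
    result = {}
    for size in range(1, len(subsumed) + 1):
        for combo in combinations(subsumed, size):
            result[frozenset((item, *combo))] = support
    return result


# Toy database (hypothetical data, not taken from the paper).
transactions = {
    1: {"a", "b", "c", "d"},
    2: {"a", "b"},
    3: {"a", "c"},
    4: {"a", "b", "c", "d"},
    5: {"b"},
}

tidsets = build_tidsets(transactions)
index = subsume_index(tidsets)
groups = expand_with_subsume("d", tidsets, index)
```

On this toy database, item d appears only in transactions 1 and 4, which also contain a, b, and c, so the seven itemsets that extend d with subsets of {a, b, c} all receive support 2 without any list intersection.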

Notes

  1. Downloaded from http://fimi.cs.helsinki.fi/data/.

References

  1. Agrawal R, Imielinski T, Swami AN (1993) Mining association rules between sets of items in large databases. In: Proceeding of the SIGMOD’93, pp 207–216

  2. Agrawal R, Srikant R (1994) Fast algorithms for mining association rules. In: Proceeding of the VLDB’94, pp 487–499

  3. Agrawal R, Srikant R (1995) Mining sequential patterns. In: Proceeding of the 11th international conference on data engineering, Taipei, Taiwan, pp 3–14

  4. Ayres J, Gehrke JE, Yiu T, Flannick J (2002) Sequential pattern mining using a bitmap representation. In: Proceeding of the SIGKDD’02, pp 429–435

  5. Baralis E, Cerquitelli T, Chiusano S (2010) Constrained itemset mining on a sequence of incoming data blocks. Int J Intell Syst 25(5):389–410

  6. Chen G, Liu H, Yu L, Wei Q, Zhang X (2006) A new approach to classification based on association rule mining. Decis Support Syst 42(2):674–689

  7. Coenen F, Leng P, Zhang L (2007) The effect of threshold values on association rule based classification accuracy. Data Knowl Eng 60(2):345–360

  8. Deng Z, Fang G, Wang Z, Xu X (2009) Mining erasable itemsets. In: Proceeding of the ICMLC’09, pp 67–73

  9. Deng Z, Wang Z (2010) A new fast vertical method for mining frequent patterns. Int J Comput Intell Syst 3(6):733–744

  10. Deng Z, Wang Z, Jiang JJ (2012) A new algorithm for fast mining frequent itemsets using N-lists. Sci China Inform Sci 55(9):2008–2030

  11. Dong J, Han M (2007) BitTableFI: an efficient mining frequent itemsets algorithm. Knowl Based Syst 20:329–335

  12. Fournier-Viger P, Faghihi U, Nkambou R, Nguifo EM (2012) CMRules: an efficient algorithm for mining sequential rules common to several sequences. Knowl Based Syst 25(1):63–76

  13. Grahne G, Zhu J (2005) Fast algorithms for frequent itemset mining using FP-trees. IEEE Trans Knowl Data Eng 17:1347–1362

  14. Gouda K, Hassaan M, Zaki MJ (2010) PRISM: a primal-encoding approach for frequent sequence mining. J Comput Syst Sci 76(1):88–102

  15. Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. In: Proceeding of the SIGMOD’00, pp 1–12

  16. Le T, Vo B, Coenen F (2013) An efficient algorithm for mining erasable itemsets using the difference of NC-Sets. In: Proceeding of the IEEE SMC’13, pp 2270–2274

  17. Le T, Vo B (2014) MEI: an efficient algorithm for mining erasable itemsets. Eng Appl Artif Intell 27(1):155–166

  18. Li W, Han J, Pei J (2001) CMAR: accurate and efficient classification based on multiple class-association rules. In: Proceeding of the ICDM’01, pp 369–376

  19. Lim AHL, Lee CS (2010) Processing online analytics with classification and association rule mining. Knowl Based Syst 23(3):248–255

  20. Liu B, Hsu W, Ma Y (1998) Integrating classification and association rule mining. In: Proceeding of the SIGKDD’98, pp 80–86

  21. Liu L, Yu Z, Guo J, Mao C, Hong X (2013) Chinese question classification based on question property kernel. Int J Mach Learn Cybern. doi:10.1007/s13042-013-0216-y

  22. Lucchese C, Orlando S, Perego R (2006) Fast and memory efficient mining of frequent closed itemsets. IEEE Trans Knowl Data Eng 18(1):21–36

  23. Mohamed MH, Darwieesh MM (2013) Efficient mining frequent itemsets algorithms. Int J Mach Learn Cybern. doi:10.1007/s13042-013-0172-6

  24. Nguyen LTT, Vo B, Hong TP, Hoang CT (2012) Classification based on association rules: a lattice-based approach. Expert Syst Appl 39(13):11357–11366

  25. Pasquier N, Bastide Y, Taouil R, Lakhal L (1999) Efficient mining of association rules using closed itemset lattices. Inform Syst 24(1):25–46

  26. Pham TT, Luo JW, Hong TP, Vo B (2012) MSGPs: a novel algorithm for mining sequential generator patterns. In: Proceeding of the ICCCI’12, pp 393–402

  27. Song W, Yang B, Xu Z (2008) Index-BitTableFI: an improved algorithm for mining frequent itemsets. Knowl Based Syst 21:507–513

  28. Veloso A, Meira W Jr, Goncalves M, Almeida HM, Zaki MJ (2011) Calibrated lazy associative classification. Inform Sci 181(13):2656–2670

  29. Vo B, Coenen F, Le B (2013) A new method for mining Frequent Weighted Itemsets based on WIT-trees. Expert Syst Appl 40(4):1256–1264

  30. Vo B, Hong TP, Le B (2013) A lattice-based approach for mining most generalization association rules. Knowl Based Syst 45:20–30

  31. Vo B, Le B (2011) Interestingness measures for mining association rules: combination between lattice and hash tables. Expert Syst Appl 38(9):11630–11640

  32. Vo B, Le T, Coenen F, Hong TP (2013) A hybrid approach for mining frequent itemsets. In: Proceeding of the IEEE SMC’13, pp 4647–4651

  33. Zaki MJ (2004) Mining non-redundant association rules. Data Min Knowl Disc 9(3):223–248

  34. Zaki MJ, Parthasarathy S, Ogihara M, Li W (1997) New algorithms for fast discovery of association rules. In: Proceeding of KDD’97, pp 283–286

  35. Zaki MJ, Hsiao CJ (2005) Efficient algorithms for mining closed itemsets and their lattice structure. IEEE Trans Knowl Data Eng 17(4):462–478

  36. Zhang X, Chen G, Wei Q (2011) Building a highly-compact and accurate associative classifier. Appl Intell 34(1):74–86

  37. Zhao S, Tsang ECC, Chen D, Wang XZ (2010) Building a rule-based classifier—a fuzzy-rough set approach. IEEE Trans Knowl Data Eng 22(5):624–638

Acknowledgments

This research was funded by the Foundation for Science and Technology Development of Ton Duc Thang University (FOSTECT), website: http://fostect.tdt.edu.vn, under Grant FOSTECT.2014.BR.07.

Author information

Corresponding author

Correspondence to Tuong Le.

Additional information

This paper is an expanded version of the paper “A hybrid approach for mining frequent itemsets” [32], presented at the IEEE International Conference on Systems, Man, and Cybernetics 2013.

About this article

Cite this article

Vo, B., Le, T., Coenen, F. et al. Mining frequent itemsets using the N-list and subsume concepts. Int. J. Mach. Learn. & Cyber. 7, 253–265 (2016). https://doi.org/10.1007/s13042-014-0252-2

  • DOI: https://doi.org/10.1007/s13042-014-0252-2