Mining frequent itemsets using the N-list and subsume concepts

Vo, Bay; Le, Tuong; Coenen, Frans; Hong, Tzung-Pei

doi:10.1007/s13042-014-0252-2

Original Article
Published: 26 April 2014

International Journal of Machine Learning and Cybernetics, Volume 7, pages 253–265 (2016)

Bay Vo¹, Tuong Le¹, Frans Coenen² & Tzung-Pei Hong³

Abstract

Frequent itemset mining is fundamental to many data mining problems aimed at finding interesting patterns in data. The recently presented PrePost algorithm mines frequent itemsets using N-lists and in most cases outperforms other current state-of-the-art algorithms. This paper proposes an improved version of PrePost, the N-list and Subsume-based algorithm for mining Frequent Itemsets (NSFI), which uses a hash table to improve the creation of the N-lists associated with 1-itemsets, together with an improved N-list intersection algorithm. Furthermore, two new theorems are proposed for determining the “subsume index” of frequent 1-itemsets based on the N-list concept. Using the subsume index, NSFI can identify groups of frequent itemsets without determining the N-lists associated with them. The experimental results show that NSFI outperforms PrePost in terms of runtime and memory usage and outperforms dEclat in terms of runtime.
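
To illustrate the subsume idea mentioned in the abstract, the following is a minimal Python sketch; it is not the authors' NSFI implementation. It assumes the usual definition of the subsume index (an item j belongs to the subsume index of item i when every transaction containing i also contains j) and checks that condition directly on transaction-id sets rather than through the N-list theorems the paper proposes. The function names and the toy database are hypothetical.

```python
from itertools import combinations


def tidsets(transactions):
    """Map each item to the set of ids of the transactions that contain it."""
    index = {}
    for tid, items in enumerate(transactions):
        for item in items:
            index.setdefault(item, set()).add(tid)
    return index


def subsume_index(transactions, min_support):
    """For each frequent item i, list the other frequent items j such that
    every transaction containing i also contains j.

    Assumption: this is the plain tidset formulation, not the N-list-based
    test described in the paper.
    """
    tids = {
        item: t
        for item, t in tidsets(transactions).items()
        if len(t) >= min_support
    }
    return {
        i: sorted(j for j in tids if j != i and tids[i] <= tids[j])
        for i in tids
    }


def expand_with_subsume(item, support, subsumed):
    """Combine `item` with every non-empty subset of its subsumed items.

    Each resulting itemset occurs in exactly the transactions that contain
    `item`, so it inherits the support of `item` without further counting.
    """
    groups = {}
    for size in range(1, len(subsumed) + 1):
        for combo in combinations(subsumed, size):
            groups[frozenset((item,) + combo)] = support
    return groups


# Hypothetical toy database of five transactions.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c", "d"},
    {"a", "b", "c", "d"},
]

index = subsume_index(transactions, min_support=2)
groups = expand_with_subsume("d", 2, index["d"])
```

In this toy database, item d appears only in transactions that also contain b and c, so its subsume index is [b, c], and the itemsets {d, b}, {d, c} and {d, b, c} take the support of d (2) directly.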

Notes

Downloaded from http://fimi.cs.helsinki.fi/data/.

References

Agrawal R, Imielinski T, Swami AN (1993) Mining association rules between sets of items in large databases. In: Proceedings of the SIGMOD’93, pp 207–216
Agrawal R, Srikant R (1994) Fast algorithms for mining association rules. In: Proceedings of the VLDB’94, pp 487–499
Agrawal R, Srikant R (1995) Mining sequential patterns. In: Proceedings of the 11th international conference on data engineering, Taipei, Taiwan, pp 3–14
Ayres J, Gehrke JE, Yiu T, Flannick J (2002) Sequential pattern mining using a bitmap representation. In: Proceedings of the SIGKDD’02, pp 429–435
Baralis E, Cerquitelli T, Chiusano S (2010) Constrained itemset mining on a sequence of incoming data blocks. Int J Intell Syst 25(5):389–410
Chen G, Liu H, Yu L, Wei Q, Zhang X (2006) A new approach to classification based on association rule mining. Decis Support Syst 42(2):674–689
Coenen F, Leng P, Zhang L (2007) The effect of threshold values on association rule based classification accuracy. Data Knowl Eng 60(2):345–360
Deng Z, Fang G, Wang Z, Xu X (2009) Mining erasable itemsets. In: Proceedings of the ICMLC’09, pp 67–73
Deng Z, Wang Z (2010) A new fast vertical method for mining frequent patterns. Int J Comput Intell Syst 3(6):733–744
Deng Z, Wang Z, Jiang JJ (2012) A new algorithm for fast mining frequent itemsets using N-lists. Sci China Inform Sci 55(9):2008–2030
Dong J, Han M (2007) BitTableFI: an efficient mining frequent itemsets algorithm. Knowl Based Syst 20:329–335
Fournier-Viger P, Faghihi U, Nkambou R, Nguifo EM (2012) CMRules: an efficient algorithm for mining sequential rules common to several sequences. Knowl Based Syst 25(1):63–76
Grahne G, Zhu J (2005) Fast algorithms for frequent itemset mining using FP-trees. IEEE Trans Knowl Data Eng 17:1347–1362
Gouda K, Hassaan M, Zaki MJ (2010) PRISM: a primal-encoding approach for frequent sequence mining. J Comput Syst Sci 76(1):88–102
Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. In: Proceedings of the SIGMOD’00, pp 1–12
Le T, Vo B, Coenen F (2013) An efficient algorithm for mining erasable itemsets using the difference of NC-Sets. In: Proceedings of the IEEE SMC’13, pp 2270–2274
Le T, Vo B (2014) MEI: an efficient algorithm for mining erasable itemsets. Eng Appl Artif Intell 27(1):155–166
Li W, Han J, Pei J (2001) CMAR: accurate and efficient classification based on multiple class-association rules. In: Proceedings of the ICDM’01, pp 369–376
Lim AHL, Lee CS (2010) Processing online analytics with classification and association rule mining. Knowl Based Syst 23(3):248–255
Liu B, Hsu W, Ma Y (1998) Integrating classification and association rule mining. In: Proceedings of the SIGKDD’98, pp 80–86
Liu L, Yu Z, Guo J, Mao C, Hong X (2013) Chinese question classification based on question property kernel. Int J Mach Learn Cybern. doi:10.1007/s13042-013-0216-y
Lucchese C, Orlando S, Perego R (2006) Fast and memory efficient mining of frequent closed itemsets. IEEE Trans Knowl Data Eng 18(1):21–36
Mohamed MH, Darwieesh MM (2013) Efficient mining frequent itemsets algorithms. Int J Mach Learn Cybern. doi:10.1007/s13042-013-0172-6
Nguyen LTT, Vo B, Hong TP, Hoang CT (2012) Classification based on association rules: a lattice-based approach. Expert Syst Appl 39(13):11357–11366
Pasquier N, Bastide Y, Taouil R, Lakhal L (1999) Efficient mining of association rules using closed itemset lattices. Inform Syst 24(1):25–46
Pham TT, Luo JW, Hong TP, Vo B (2012) MSGPs: a novel algorithm for mining sequential generator patterns. In: Proceedings of the ICCCI’12, pp 393–402
Song W, Yang B, Xu Z (2008) Index-BitTableFI: an improved algorithm for mining frequent itemsets. Knowl Based Syst 21:507–513
Veloso A, Meira W Jr, Goncalves M, Almeida HM, Zaki MJ (2011) Calibrated lazy associative classification. Inform Sci 181(13):2656–2670
Vo B, Coenen F, Le B (2013) A new method for mining Frequent Weighted Itemsets based on WIT-trees. Expert Syst Appl 40(4):1256–1264
Vo B, Hong TP, Le B (2013) A lattice-based approach for mining most generalization association rules. Knowl Based Syst 45:20–30
Vo B, Le B (2011) Interestingness measures for mining association rules: combination between lattice and hash tables. Expert Syst Appl 38(9):11630–11640
Vo B, Le T, Coenen F, Hong TP (2013) A hybrid approach for mining frequent itemsets. In: Proceedings of the IEEE SMC’13, pp 4647–4651
Zaki MJ (2004) Mining non-redundant association rules. Data Min Knowl Disc 9(3):223–248
Zaki MJ, Parthasarathy S, Ogihara M, Li W (1997) New algorithms for fast discovery of association rules. In: Proceedings of KDD’97, pp 283–286
Zaki MJ, Hsiao CJ (2005) Efficient algorithms for mining closed itemsets and their lattice structure. IEEE Trans Knowl Data Eng 17(4):462–478
Zhang X, Chen G, Wei Q (2011) Building a highly-compact and accurate associative classifier. Appl Intell 34(1):74–86
Zhao S, Tsang ECC, Chen D, Wang XZ (2010) Building a rule-based classifier—a fuzzy-rough set approach. IEEE Trans Knowl Data Eng 22(5):624–638

Acknowledgments

This research was funded by the Foundation for Science and Technology Development of Ton Duc Thang University (FOSTECT), website: http://fostect.tdt.edu.vn, under Grant FOSTECT.2014.BR.07.

Author information

Authors and Affiliations

Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam
Bay Vo & Tuong Le
Department of Computer Science, University of Liverpool, Liverpool, UK
Frans Coenen
Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung, Taiwan
Tzung-Pei Hong

Corresponding author

Correspondence to Tuong Le.

Additional information

This paper is an expanded version of the paper “A hybrid approach for mining frequent itemsets” [32], presented at the IEEE International Conference on Systems, Man, and Cybernetics 2013.

About this article

Cite this article

Vo, B., Le, T., Coenen, F. et al. Mining frequent itemsets using the N-list and subsume concepts. Int. J. Mach. Learn. & Cyber. 7, 253–265 (2016). https://doi.org/10.1007/s13042-014-0252-2

Received: 27 December 2013
Accepted: 02 April 2014
Published: 26 April 2014
Issue Date: April 2016
DOI: https://doi.org/10.1007/s13042-014-0252-2
