Efficient algorithms for deriving complete frequent itemsets from frequent closed itemsets

Wu, Cheng-Wei; Huang, JianTao; Lin, Yun-Wei; Chuang, Chien-Yu; Tseng, Yu-Chee

doi:10.1007/s10489-020-02172-7


Published: 11 April 2021

Volume 52, pages 7002–7023, (2022)

Applied Intelligence

Cheng-Wei Wu ORCID: orcid.org/0000-0002-5656-015X¹,
JianTao Huang¹,
Yun-Wei Lin¹,
Chien-Yu Chuang¹ &
…
Yu-Chee Tseng²


Abstract

Mining frequent itemsets (FIs) from dense datasets usually produces too many itemsets, so the mining task suffers from very long execution times and high memory consumption. Frequent closed itemsets (FCIs) are a compact and lossless representation of FIs. Mining FCIs not only reduces execution time and memory usage but also preserves the complete information of the FIs, which can be derived from the FCIs. Although many studies have proposed efficient methods for mining FCIs, few have developed algorithms for efficiently deriving FIs from FCIs. In this work, we propose two efficient algorithms, DFI-List and DFI-Growth, for deriving FIs from FCIs. Both algorithms adopt depth-first search and a divide-and-conquer methodology to derive all the FIs. DFI-List derives all the FIs with a vertical index structure called Cid List. DFI-Growth compresses the information of FCIs into tree structures and applies a pattern-growth strategy to derive FIs from the trees. Empirical experiments show that DFI-List is the most efficient and scalable algorithm on dense datasets. For example, when the minimum support threshold is set to 50% on the Chess dataset, DFI-List runs over 100 times faster than LevelWise (Pasquier et al. Inf Syst 24(1): 25-46, 1999b). DFI-Growth is the most stable and memory-efficient algorithm on sparse datasets. Both DFI-Growth and DFI-List are superior to the state-of-the-art algorithm (Pasquier et al. Inf Syst 24(1): 25-46, 1999b) in terms of execution time.
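To make the "lossless" claim concrete, the following is a minimal sketch of a naive derivation, not DFI-List or DFI-Growth. It relies on the standard property that the support of a frequent itemset equals the largest support among the frequent closed itemsets that contain it. The function name `derive_frequent_itemsets` and the toy data are assumptions chosen for illustration only.

```python
from itertools import combinations


def derive_frequent_itemsets(closed_itemsets):
    """Naively derive every frequent itemset and its support from FCIs.

    closed_itemsets: dict mapping frozenset of items -> support count.
    Hypothetical helper for illustration; it enumerates all non-empty
    subsets of each closed itemset, which is exponential in itemset size.
    """
    supports = {}
    for closed, support in closed_itemsets.items():
        items = sorted(closed)
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                key = frozenset(subset)
                # An itemset's support is the largest support among
                # the closed itemsets that contain it.
                if support > supports.get(key, 0):
                    supports[key] = support
    return supports


# Toy FCIs for the transactions {a}, {a,b}, {a,b,c}, {a,b,c}
# with a minimum support count of 2 (assumed example data).
fcis = {
    frozenset({"a"}): 4,
    frozenset({"a", "b"}): 3,
    frozenset({"a", "b", "c"}): 2,
}
fis = derive_frequent_itemsets(fcis)
for itemset, support in sorted(fis.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

Because this baseline revisits the same subsets once per closed superset, its cost grows quickly on dense data, which is the kind of redundancy that dedicated derivation algorithms aim to avoid.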


References

Agrawal R, Srikant R, et al. (1994) Fast algorithms for mining association rules. In: Proc. 20th int. conf. very large data bases, VLDB, vol 1215. Citeseer, pp 487–499
Aryabarzan N, Minaei-Bidgoli B, Teshnehlab M (2018) negfin: An efficient algorithm for fast mining frequent itemsets. Expert Systems with Applications 105:129–143
Boulicaut JF, Bykowski A, Rigotti C (2000) Approximation of frequency queries by means of free-sets. In: European conference on principles of data mining and knowledge discovery. Springer, pp 75–85
Calders T, Goethals B (2002) Mining all non-derivable frequent itemsets. In: European conference on principles of data mining and knowledge discovery. Springer, pp 74–86
Deng ZH (2016) Diffnodesets: An efficient structure for fast mining frequent itemsets. Appl Soft Comput 41:214–223
El-Hajj M, Zaiane OR (2003) Cofi-tree mining: a new approach to pattern growth with reduced candidacy generation. In: Workshop on frequent itemset mining implementations (FIMI’03) in conjunction with IEEE-ICDM
Fournier-Viger P, Gomariz A, Gueniche T, Soltani A, Wu CW, Tseng VS, et al. (2014) Spmf: A java open-source pattern mining library. J Mach Learn Res 15(1):3389–3393
Gouda K, Zaki MJ (2005) Genmax: An efficient algorithm for mining maximal frequent itemsets. Data Min Knowl Disc 11(3):223–242
Gupta S, Mamtora R (2014) A survey on association rule mining in market basket analysis. Int J Inf Comput Technol 4(4):409–414
Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. ACM SIGMOD Record 29(2):1–12
Huang J, Lai YP, Lo C, Wu CW (2019) An efficient algorithm for deriving frequent itemsets from lossless condensed representation. In: International conference on industrial, engineering and other applications of applied intelligent systems. Springer, pp 216–229
Kim D, Yun U (2016) Efficient mining of high utility pattern with considering of rarity and length. Appl Intell 45(1):152–173
Kim D, Yun U (2017) Efficient algorithm for mining high average-utility itemsets in incremental transaction databases. Appl Intell 47(1):114–131
Le T, Vo B (2015) An n-list-based algorithm for mining frequent closed patterns. Expert Syst Appl 42(19):6648–6657
Lee G, Yun U (2017) A new efficient approach for mining uncertain frequent patterns using minimum data structure without false positives. Futur Gener Comput Syst 68:89–110
Lee G, Yun U, Ryang H, Kim D (2016) Approximate maximal frequent pattern mining with weight conditions and error tolerance. Int J Pattern Recogn Artif Intell 30(06):1650012
Liu J, Shang J, Wang C, Ren X, Han J (2015) Mining quality phrases from massive text corpora. In: Proceedings of the 2015 ACM SIGMOD international conference on management of data, pp 1729–1744
Lucchese C, Orlando S, Perego R (2005) Fast and memory efficient mining of frequent closed itemsets. IEEE Trans Knowl Data Eng 18(1):21–36
Park JS, Chen MS, Yu PS (1995) An effective hash-based algorithm for mining association rules. ACM SIGMOD Record 24(2):175–186
Pasquier N, Bastide Y, Taouil R, Lakhal L (1999a) Discovering frequent closed itemsets for association rules. In: International conference on database theory. Springer, pp 398–416
Pasquier N, Bastide Y, Taouil R, Lakhal L (1999b) Efficient mining of association rules using closed itemset lattices. Inf Syst 24(1):25–46
Pei J, Han J, Lu H, Nishio S, Tang S, Yang D (2001) H-mine: Hyper-structure mining of frequent patterns in large databases. In: Proceedings 2001 IEEE international conference on data mining. IEEE, pp 441–448
Prabha S, Shanmugapriya S, Duraiswamy K (2013) A survey on closed frequent pattern mining. Int J Comput Appl 63(14)
Ryang H, Yun U (2017) Indexed list-based high utility pattern mining with utility upper-bound reduction and pattern combination techniques. Knowl Inf Syst 51(2):627–659
Ting S, Shum C, Kwok SK, Tsang AH, Lee WB, et al. (2009) Data mining in biomedicine: Current applications and further directions for research. J Softw Eng Appl 2(03):150
Wang J, Han J, Pei J (2003) Closet+ searching for the best strategies for mining frequent closed itemsets. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining, pp 236–245
Yun U, Lee G, Yoon E (2017) Efficient high utility pattern mining for establishing manufacturing plans with sliding window control. IEEE Trans Ind Electron 64(9):7239–7249
Yun U, Ryang H, Lee G, Fujita H (2017) An efficient algorithm for mining high utility patterns from incremental databases with one database scan. Knowl-Based Syst 124:188–206
Zaki MJ (2000) Scalable algorithms for association mining. IEEE Trans Knowl Data Eng 12(3):372–390
Zaki MJ, Hsiao CJ (2002) Charm: An efficient algorithm for closed itemset mining. In: Proceedings of the 2002 SIAM international conference on data mining. SIAM, pp 457–473
Zhang Q, Segall RS (2008) Web mining: A survey of current research, techniques, and software. Int J Inf Technol Decision Making 7(04):683–720
Source code of the implemented dfi-growth algorithm released in spmf. http://www.philippe-fournier-viger.com/spmf/DFI-Growth.php
Source code of the implemented dfi-list algorithm released in spmf. http://www.philippe-fournier-viger.com/spmf/DFI-List.php
Source code of the implemented levelwise algorithm released in spmf. http://www.philippe-fournier-viger.com/spmf/LevelWise.php


Acknowledgments

This work is partially supported by the Ministry of Science and Technology, Taiwan, under Grant Nos. 109-2221-E-197-027 and 109-2634-F-009-026 through the Pervasive Artificial Intelligence Research (PAIR) Labs.

Author information

Authors and Affiliations

National I-Lan University, Yilan, Taiwan
Cheng-Wei Wu, JianTao Huang, Yun-Wei Lin & Chien-Yu Chuang
National Yang Ming Chiao Tung University, Yilan, Taiwan
Yu-Chee Tseng


Corresponding author

Correspondence to Cheng-Wei Wu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special issue on Artificial intelligence in practice - from theory to application

Guest Editors: Franz Wotawa, Gerhard Friedrich and Ingo Pill


About this article

Cite this article

Wu, CW., Huang, J., Lin, YW. et al. Efficient algorithms for deriving complete frequent itemsets from frequent closed itemsets. Appl Intell 52, 7002–7023 (2022). https://doi.org/10.1007/s10489-020-02172-7


Accepted: 22 December 2020
Published: 11 April 2021
Issue Date: April 2022
DOI: https://doi.org/10.1007/s10489-020-02172-7
