Efficient algorithms for deriving complete frequent itemsets from frequent closed itemsets

Applied Intelligence

Abstract

Mining frequent itemsets (FIs) from dense datasets usually produces too many itemsets, so the mining task suffers from very long execution times and high memory consumption. The frequent closed itemset (FCI) is a compact and lossless representation of FIs. Mining FCIs not only reduces execution time and memory usage but also preserves the complete information of the FIs, which can be derived from the FCIs. Although many studies have proposed efficient methods for mining FCIs, few have developed algorithms for efficiently deriving FIs from FCIs. In this work, we propose two efficient algorithms, DFI-List and DFI-Growth, for deriving FIs from FCIs. Both algorithms adopt depth-first search and a divide-and-conquer methodology to derive all the FIs. DFI-List derives all the FIs with a vertical index structure called the Cid List. DFI-Growth compresses the information of FCIs into tree structures and applies a pattern-growth strategy to derive FIs from the trees. Empirical experiments show that DFI-List is the most efficient and scalable algorithm on dense datasets. For example, when the minimum support threshold is set to 50% on the Chess dataset, DFI-List runs over 100 times faster than LevelWise (Pasquier et al. Inf Syst 24(1): 25-46, 1999b). DFI-Growth is the most stable and memory-efficient algorithm on sparse datasets. Both DFI-Growth and DFI-List are superior to the state-of-the-art algorithm (Pasquier et al. Inf Syst 24(1): 25-46, 1999b) in terms of execution time.
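
To illustrate why FCIs are a lossless representation, the following is a minimal Python sketch of a naive derivation. It is not DFI-List or DFI-Growth, whose details are not given in the abstract. It relies only on the standard property that every non-empty subset of an FCI is frequent and that the support of an FI equals the largest support among the FCIs containing it. The function name `derive_frequent_itemsets` and the toy input are assumptions made for illustration.

```python
from itertools import combinations


def derive_frequent_itemsets(closed_itemsets):
    """Naively derive all FIs and their supports from FCIs.

    closed_itemsets maps each frequent closed itemset (a frozenset)
    to its absolute support count.
    """
    supports = {}
    for closed, support in closed_itemsets.items():
        items = sorted(closed)
        # Enumerate every non-empty subset of this closed itemset.
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                key = frozenset(subset)
                # Keep the largest support seen among closed supersets.
                if support > supports.get(key, 0):
                    supports[key] = support
    return supports


# Hypothetical toy input: the FCIs of the transactions
# {a,b,c}, {a,b,c}, {a,b}, {b} with a minimum support count of 2.
fcis = {
    frozenset("b"): 4,
    frozenset("ab"): 3,
    frozenset("abc"): 2,
}
fis = derive_frequent_itemsets(fcis)
```

This brute-force enumeration revisits shared subsets once per closed superset, which is the kind of redundant work that more structured derivation methods aim to avoid.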

References

  1. Agrawal R, Srikant R, et al. (1994) Fast algorithms for mining association rules. In: Proc. 20th int. conf. very large data bases, VLDB, vol 1215. Citeseer, pp 487–499

  2. Aryabarzan N, Minaei-Bidgoli B, Teshnehlab M (2018) negFIN: An efficient algorithm for fast mining frequent itemsets. Expert Syst Appl 105:129–143

  3. Boulicaut JF, Bykowski A, Rigotti C (2000) Approximation of frequency queries by means of free-sets. In: European conference on principles of data mining and knowledge discovery. Springer, pp 75–85

  4. Calders T, Goethals B (2002) Mining all non-derivable frequent itemsets. In: European conference on principles of data mining and knowledge discovery. Springer, pp 74–86

  5. Deng ZH (2016) DiffNodesets: An efficient structure for fast mining frequent itemsets. Appl Soft Comput 41:214–223

  6. El-Hajj M, Zaiane OR (2003) COFI-tree mining: a new approach to pattern growth with reduced candidacy generation. In: Workshop on frequent itemset mining implementations (FIMI’03) in conjunction with IEEE-ICDM

  7. Fournier-Viger P, Gomariz A, Gueniche T, Soltani A, Wu CW, Tseng VS, et al. (2014) SPMF: A Java open-source pattern mining library. J Mach Learn Res 15(1):3389–3393

  8. Gouda K, Zaki MJ (2005) GenMax: An efficient algorithm for mining maximal frequent itemsets. Data Min Knowl Disc 11(3):223–242

  9. Gupta S, Mamtora R (2014) A survey on association rule mining in market basket analysis. Int J Inf Comput Technol 4(4):409–414

  10. Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. ACM SIGMOD Record 29(2):1–12

  11. Huang J, Lai YP, Lo C, Wu CW (2019) An efficient algorithm for deriving frequent itemsets from lossless condensed representation. In: International conference on industrial, engineering and other applications of applied intelligent systems. Springer, pp 216–229

  12. Kim D, Yun U (2016) Efficient mining of high utility pattern with considering of rarity and length. Appl Intell 45(1):152–173

  13. Kim D, Yun U (2017) Efficient algorithm for mining high average-utility itemsets in incremental transaction databases. Appl Intell 47(1):114–131

  14. Le T, Vo B (2015) An N-list-based algorithm for mining frequent closed patterns. Expert Syst Appl 42(19):6648–6657

  15. Lee G, Yun U (2017) A new efficient approach for mining uncertain frequent patterns using minimum data structure without false positives. Futur Gener Comput Syst 68:89–110

  16. Lee G, Yun U, Ryang H, Kim D (2016) Approximate maximal frequent pattern mining with weight conditions and error tolerance. Int J Pattern Recogn Artif Intell 30(06):1650012

  17. Liu J, Shang J, Wang C, Ren X, Han J (2015) Mining quality phrases from massive text corpora. In: Proceedings of the 2015 ACM SIGMOD international conference on management of data, pp 1729–1744

  18. Lucchese C, Orlando S, Perego R (2005) Fast and memory efficient mining of frequent closed itemsets. IEEE Trans Knowl Data Eng 18(1):21–36

  19. Park JS, Chen MS, Yu PS (1995) An effective hash-based algorithm for mining association rules. ACM SIGMOD Record 24(2):175–186

  20. Pasquier N, Bastide Y, Taouil R, Lakhal L (1999a) Discovering frequent closed itemsets for association rules. In: International conference on database theory. Springer, pp 398–416

  21. Pasquier N, Bastide Y, Taouil R, Lakhal L (1999b) Efficient mining of association rules using closed itemset lattices. Inf Syst 24(1):25–46

  22. Pei J, Han J, Lu H, Nishio S, Tang S, Yang D (2001) H-Mine: Hyper-structure mining of frequent patterns in large databases. In: Proceedings 2001 IEEE international conference on data mining. IEEE, pp 441–448

  23. Prabha S, Shanmugapriya S, Duraiswamy K (2013) A survey on closed frequent pattern mining. Int J Comput Appl 63(14)

  24. Ryang H, Yun U (2017) Indexed list-based high utility pattern mining with utility upper-bound reduction and pattern combination techniques. Knowl Inf Syst 51(2):627–659

  25. Ting S, Shum C, Kwok SK, Tsang AH, Lee WB, et al. (2009) Data mining in biomedicine: Current applications and further directions for research. J Softw Eng Appl 2(03):150

  26. Wang J, Han J, Pei J (2003) CLOSET+: searching for the best strategies for mining frequent closed itemsets. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining, pp 236–245

  27. Yun U, Lee G, Yoon E (2017) Efficient high utility pattern mining for establishing manufacturing plans with sliding window control. IEEE Trans Ind Electron 64(9):7239–7249

  28. Yun U, Ryang H, Lee G, Fujita H (2017) An efficient algorithm for mining high utility patterns from incremental databases with one database scan. Knowl-Based Syst 124:188–206

  29. Zaki MJ (2000) Scalable algorithms for association mining. IEEE Trans Knowl Data Eng 12(3):372–390

  30. Zaki MJ, Hsiao CJ (2002) CHARM: An efficient algorithm for closed itemset mining. In: Proceedings of the 2002 SIAM international conference on data mining. SIAM, pp 457–473

  31. Zhang Q, Segall RS (2008) Web mining: A survey of current research, techniques, and software. Int J Inf Technol Decision Making 7(04):683–720

  32. Source code of the implemented DFI-Growth algorithm released in SPMF. http://www.philippe-fournier-viger.com/spmf/DFI-Growth.php

  33. Source code of the implemented DFI-List algorithm released in SPMF. http://www.philippe-fournier-viger.com/spmf/DFI-List.php

  34. Source code of the implemented LevelWise algorithm released in SPMF. http://www.philippe-fournier-viger.com/spmf/LevelWise.php

Acknowledgments

This work was partially supported by the Ministry of Science and Technology, Taiwan, under Grant Nos. 109-2221-E-197-027 and 109-2634-F-009-026 through Pervasive Artificial Intelligence Research (PAIR) Labs.

Author information

Corresponding author

Correspondence to Cheng-Wei Wu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special issue on Artificial intelligence in practice - from theory to application

Guest Editors: Franz Wotawa, Gerhard Friedrich and Ingo Pill

About this article

Cite this article

Wu, CW., Huang, J., Lin, YW. et al. Efficient algorithms for deriving complete frequent itemsets from frequent closed itemsets. Appl Intell 52, 7002–7023 (2022). https://doi.org/10.1007/s10489-020-02172-7

  • DOI: https://doi.org/10.1007/s10489-020-02172-7
