An efficient algorithm for mining top-rank-k frequent patterns

Dam, Thu-Lan; Li, Kenli; Fournier-Viger, Philippe; Duong, Quang-Huy

doi:10.1007/s10489-015-0748-9

Published: 28 January 2016

Applied Intelligence, Volume 45, pages 96–111 (2016)

Abstract

Mining top-rank-k frequent patterns is a popular data mining task that consists of discovering the patterns in a transaction database that belong to the first k ranks in terms of support. Although several algorithms have been proposed for this task, it remains computationally expensive. To address this issue, this paper proposes a novel algorithm named BTK. It relies on a novel tree structure named TB-tree to store crucial information about frequent patterns. Moreover, BTK employs a new B-list structure to store information about patterns, and relies on subsume indexes to reduce the search space and speed up the discovery of top-rank-k frequent patterns. BTK also uses an early pruning strategy and an effective threshold raising mechanism. Additionally, BTK introduces two efficient procedures, one for generating subsume indexes and one for intersecting B-lists. Extensive experiments were conducted on several datasets to evaluate the efficiency of the proposed algorithm. Results show that BTK is highly efficient and competitive.
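
To make the task definition concrete, the following Python sketch computes top-rank-k frequent patterns by brute force on a toy database. It is not BTK and uses no TB-tree, B-lists, or subsume indexes, since this abstract does not specify those structures. It assumes that patterns with equal support share a rank and that rank 1 is the highest support; the function names `support` and `top_rank_k` are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def support(pattern, transactions):
    """Count the transactions that contain every item of `pattern`."""
    return sum(1 for t in transactions if pattern <= t)


def top_rank_k(transactions, k):
    """Return {rank: (support, patterns)} for the k highest distinct supports.

    Brute-force illustration of the task definition only: every non-empty
    itemset is enumerated, which is exponential in the number of items.
    Assumption: patterns with the same support share the same rank.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))

    # Support of every itemset that occurs in at least one transaction.
    supports = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            s = support(frozenset(combo), transactions)
            if s > 0:
                supports[combo] = s

    # The k largest distinct support values define ranks 1..k.
    top_values = sorted(set(supports.values()), reverse=True)[:k]
    return {
        rank: (value, sorted(p for p, s in supports.items() if s == value))
        for rank, value in enumerate(top_values, start=1)
    }


# Toy database of four transactions (made up for this example).
database = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
for rank, (value, patterns) in top_rank_k(database, k=2).items():
    print(rank, value, patterns)
# 1 4 [('a',)]
# 2 2 [('a', 'b'), ('a', 'c'), ('b',), ('c',)]
```

Note that the number of returned patterns can exceed k, because a rank holds every pattern with that support. The exhaustive enumeration is exactly the cost that pruning and threshold-raising strategies, such as those the abstract attributes to BTK, aim to avoid.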

Notes

Downloaded from FIMI repository http://fimi.ua.ac.be/data/.
Downloaded from http://cgi.csc.liv.ac.uk/~frans/KDD/Software/LUCS-KDD-DataGen/generator.html.

Acknowledgments

This research was funded by the National Natural Science Foundation of China (Grant Nos. 61133005, 61432005, 61370095, 61472124, 61202109, and 61472126) and the International Science & Technology Cooperation Program of China (Grant Nos. 2015DFA11240 and 2014DFBS0010). T-L. Dam was also partially supported by the science research fund of Hanoi University of Industry, Hanoi, Vietnam.

Author information

Authors and Affiliations

College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
Thu-Lan Dam, Kenli Li & Quang-Huy Duong
Faculty of Information Technology, Hanoi University of Industry, Hanoi, Vietnam
Thu-Lan Dam
CIC of HPC, National University of Defense Technology, Changsha, 410073, China
Thu-Lan Dam & Kenli Li
National Supercomputing Center in Changsha, Changsha, Hunan, 410082, China
Kenli Li
School of Natural Sciences and Humanities, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, 518055, China
Philippe Fournier-Viger

Corresponding author

Correspondence to Kenli Li.

Ethics declarations

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Conflict of interests

The authors declare that they have no conflict of interest.

Funding

This research was funded by the National Natural Science Foundation of China (Grant Nos. 61133005, 61432005, 61370095, 61472124, 61202109, and 61472126) and the International Science & Technology Cooperation Program of China (Grant No. 2015DFA11240). T-L. Dam was also partially supported by the science research fund of Hanoi University of Industry, Hanoi, Vietnam.

About this article

Cite this article

Dam, TL., Li, K., Fournier-Viger, P. et al. An efficient algorithm for mining top-rank-k frequent patterns. Appl Intell 45, 96–111 (2016). https://doi.org/10.1007/s10489-015-0748-9

Published: 28 January 2016
Issue Date: July 2016
DOI: https://doi.org/10.1007/s10489-015-0748-9
