Abstract
Mining top-rank-k frequent patterns is a popular data mining task, which consists of discovering the patterns in a transaction database that belong to the first k ranks in terms of support. Although several algorithms have been proposed for this task, it remains computationally expensive. To address this issue, this paper proposes a novel algorithm named BTK. It relies on a novel tree structure named TB-tree to store crucial information about frequent patterns. Moreover, BTK employs a new B-list structure to store information about patterns, and relies on subsume indexes to reduce the search space and speed up the discovery of top-rank-k frequent patterns. BTK also uses an early pruning strategy and an effective threshold raising mechanism. Additionally, BTK introduces two efficient procedures, one for generating subsume indexes and one for intersecting B-lists. Extensive experiments were conducted on several datasets to evaluate the efficiency of the proposed algorithm. Results show that BTK is highly efficient and competitive.
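To make the task definition concrete, the following is a minimal sketch of top-rank-k mining by brute-force enumeration. It is not the BTK algorithm and uses none of its structures (TB-tree, B-lists, subsume indexes); the function name `top_rank_k` and the toy database `db` are hypothetical and chosen only for illustration. Rank 1 is assumed to be the highest distinct support value, rank 2 the next highest, and so on, so a single rank may contain several patterns.

```python
from collections import Counter
from itertools import combinations


def top_rank_k(transactions, k):
    """Naive top-rank-k miner (illustrative only, exponential in transaction size).

    Returns a dict mapping each of the k highest distinct support values
    to the sorted list of patterns having that support.
    """
    support = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        # Count every non-empty subset of the transaction once.
        for size in range(1, len(items) + 1):
            for pattern in combinations(items, size):
                support[pattern] += 1

    # Distinct support values in descending order; position r-1 holds rank r.
    ranks = sorted(set(support.values()), reverse=True)[:k]
    return {s: sorted(p for p, c in support.items() if c == s) for s in ranks}


# Hypothetical toy database of five transactions.
db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a"]]
result = top_rank_k(db, 2)
```

On this toy database, rank 1 holds only the pattern {a} (support 4) and rank 2 holds {b} and {c} (support 3 each). Enumerating every subset is what makes the naive approach expensive, which is the cost that pruning and threshold raising strategies aim to avoid.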
Notes
Downloaded from FIMI repository http://fimi.ua.ac.be/data/.
Acknowledgments
This research was funded by the National Natural Science Foundation of China (Grant Nos. 61133005, 61432005, 61370095, 61472124, 61202109, and 61472126) and the International Science & Technology Cooperation Program of China (Grant Nos. 2015DFA11240 and 2014DFBS0010). T-L. Dam was also partially supported by the science research fund of Hanoi University of Industry, Hanoi, Vietnam.
Ethics declarations
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Conflict of interests
The authors declare that they have no conflict of interest.
Funding
This research was funded by the National Natural Science Foundation of China (Grant Nos. 61133005, 61432005, 61370095, 61472124, 61202109, and 61472126) and the International Science & Technology Cooperation Program of China (Grant No. 2015DFA11240). T-L. Dam was also partially supported by the science research fund of Hanoi University of Industry, Hanoi, Vietnam.
Cite this article
Dam, TL., Li, K., Fournier-Viger, P. et al. An efficient algorithm for mining top-rank-k frequent patterns. Appl Intell 45, 96–111 (2016). https://doi.org/10.1007/s10489-015-0748-9
DOI: https://doi.org/10.1007/s10489-015-0748-9