
An efficient algorithm for mining top-rank-k frequent patterns

Applied Intelligence

Abstract

Mining top-rank-k frequent patterns is a popular data mining task, which consists of discovering the patterns in a transaction database that belong to the first k ranks in terms of support. Although several algorithms have been proposed for this task, it remains computationally expensive. To address this issue, this paper proposes a novel algorithm named BTK. It relies on a novel tree structure named TB-tree to store crucial information about frequent patterns. Moreover, BTK employs a new B-list structure to store information about patterns, and relies on subsume indexes to reduce the search space and speed up the discovery of top-rank-k frequent patterns. BTK also uses an early pruning strategy and an effective threshold raising mechanism. Additionally, BTK introduces two efficient procedures, one for generating subsume indexes and one for intersecting B-lists. Extensive experiments were conducted on several datasets to evaluate the efficiency of the proposed algorithm. Results show that BTK is highly efficient and competitive.
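
To make the task definition concrete, the following is a minimal brute-force sketch of top-rank-k mining in Python. It is not the BTK algorithm: the TB-tree, B-lists, subsume indexes, and pruning strategies are not detailed in this preview, so none of them are modeled here. The sketch assumes the usual reading of "rank", namely that rank r corresponds to the r-th highest distinct support value and that all patterns sharing that support share the rank. The function name `top_rank_k`, the `max_len` parameter, and the toy database are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def top_rank_k(transactions, k, max_len=None):
    """Brute-force top-rank-k mining (illustrative only, exponential cost).

    Returns {rank: (support, [patterns])} for ranks 1..k, where rank r is
    assumed to be the r-th highest distinct support value.
    """
    support = Counter()
    for items in transactions:
        unique = sorted(set(items))  # ignore duplicate items in a transaction
        limit = len(unique) if max_len is None else min(max_len, len(unique))
        # Count every non-empty sub-itemset of the transaction once.
        for size in range(1, limit + 1):
            for pattern in combinations(unique, size):
                support[pattern] += 1

    # Distinct support values, highest first; keep only the first k ranks.
    ranked_supports = sorted(set(support.values()), reverse=True)[:k]
    result = {}
    for rank, value in enumerate(ranked_supports, start=1):
        patterns = sorted(p for p, s in support.items() if s == value)
        result[rank] = (value, patterns)
    return result


# Hypothetical toy transaction database (not taken from the paper).
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["a"],
    ["b", "c"],
]
top2 = top_rank_k(transactions, k=2)
for rank, (value, patterns) in top2.items():
    print(rank, value, patterns)
```

On this toy database, rank 1 holds the single pattern {a} with support 4, and rank 2 holds {b} and {c}, both with support 3. Enumerating every sub-itemset is only workable for tiny inputs, which is the cost that structures such as those named in the abstract aim to avoid.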




Notes

  1. Downloaded from FIMI repository http://fimi.ua.ac.be/data/.

  2. Downloaded from http://cgi.csc.liv.ac.uk/~frans/KDD/Software/LUCS-KDD-DataGen/generator.html.


Acknowledgments

This research was funded by the National Natural Science Foundation of China (Grant Nos. 61133005, 61432005, 61370095, 61472124, 61202109, and 61472126) and the International Science & Technology Cooperation Program of China (Grant Nos. 2015DFA11240 and 2014DFBS0010). T-L. Dam was also partially supported by the science research fund of Hanoi University of Industry, Hanoi, Vietnam.

Author information

Corresponding author

Correspondence to Kenli Li.

Ethics declarations

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Conflict of interests

The authors declare that they have no conflict of interest.

Funding

This research was funded by the National Natural Science Foundation of China (Grant Nos. 61133005, 61432005, 61370095, 61472124, 61202109, and 61472126) and the International Science & Technology Cooperation Program of China (Grant No. 2015DFA11240). T-L. Dam was also partially supported by the science research fund of Hanoi University of Industry, Hanoi, Vietnam.

About this article

Cite this article

Dam, TL., Li, K., Fournier-Viger, P. et al. An efficient algorithm for mining top-rank-k frequent patterns. Appl Intell 45, 96–111 (2016). https://doi.org/10.1007/s10489-015-0748-9

  • DOI: https://doi.org/10.1007/s10489-015-0748-9
