Fast Top-K association rule mining using rule generation property pruning

Applied Intelligence

Abstract

Traditional association rule mining algorithms can have long runtimes, consume large amounts of memory, and generate a huge number of rules. Browsing through numerous rules and adjusting parameters to find just enough rules is a tedious task for users, who are often only interested in the strongest rules. Hence, many recent studies have focused on mining the top-k most frequent association rules that satisfy a minimum confidence threshold, which limits the number of rules by ranking them by frequency. Though this redefined task has many applications, the performance of current algorithms remains an issue. To address it, this paper presents a novel algorithm named FTARM (Fast Top-K Association Rule Miner) that efficiently finds the set of top-k association rules using a novel technique called Rule Generation Property Pruning (RGPP). This technique reduces the search space by analyzing the internal relationships between the items of the database to be mined and the parameters set by users. It also uses a novel candidate pruning property to speed up the mining process. FTARM’s efficiency was evaluated on various public benchmark datasets. The experiments showed a substantial reduction in association rule mining time and memory usage, and that FTARM scales well, which can benefit many applications.
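
The task stated in the abstract (return the k most frequent rules among those that meet a minimum confidence) can be illustrated with a minimal brute-force sketch. This is not FTARM and it does not implement RGPP, whose details are not given in this preview; it only shows what a correct answer to the task looks like on a toy database. The function name `naive_top_k_rules`, the sample transactions, and the tie-breaking order are assumptions made for illustration.

```python
from itertools import combinations


def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def naive_top_k_rules(transactions, k, minconf):
    """Enumerate every rule X -> Y and keep the k most frequent ones
    whose confidence is at least `minconf`.

    Hypothetical baseline for illustration only: it is exponential in
    the number of distinct items and performs no pruning at all.
    Each result is (support, confidence, antecedent, consequent).
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            sup = support_count(itemset, transactions)
            if sup == 0:
                continue  # the itemset never occurs, so no rule from it exists
            # Split the itemset into a non-empty antecedent and consequent.
            for r in range(1, size):
                for ante in combinations(combo, r):
                    antecedent = frozenset(ante)
                    conf = sup / support_count(antecedent, transactions)
                    if conf >= minconf:
                        consequent = tuple(sorted(itemset - antecedent))
                        rules.append((sup, conf, ante, consequent))
    # Rank by support, then confidence; the remaining keys only make
    # the order deterministic when rules tie (an assumption of this sketch).
    rules.sort(key=lambda r: (-r[0], -r[1], r[2], r[3]))
    return rules[:k]


if __name__ == "__main__":
    db = [
        {"bread", "milk"},
        {"bread", "milk", "eggs"},
        {"bread", "eggs"},
        {"milk"},
        {"bread", "milk"},
    ]
    for sup, conf, x, y in naive_top_k_rules(db, k=3, minconf=0.7):
        print(f"{x} -> {y}  support={sup}  confidence={conf:.2f}")
```

Because the user fixes k rather than a minimum support, the number of returned rules is bounded directly, which is the usability argument the abstract makes. The cost of the exhaustive enumeration above is what pruning techniques aim to avoid.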

Acknowledgments

This research is sponsored by the Science and Technology Planning Project of Sichuan Province under Grant No. 2020YFG0054, and the Scientific Research Project of State Grid Sichuan Electric Power Company Information and Communication Company under Grant No. SGSCXT00XGJS1800219.

Author information

Corresponding author

Correspondence to Xinzheng Niu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, X., Niu, X. & Fournier-Viger, P. Fast Top-K association rule mining using rule generation property pruning. Appl Intell 51, 2077–2093 (2021). https://doi.org/10.1007/s10489-020-01994-9

  • DOI: https://doi.org/10.1007/s10489-020-01994-9
