HUC-Prune: an efficient candidate pruning technique to mine high utility patterns

Applied Intelligence

Abstract

Traditional frequent pattern mining methods assume an equal profit or weight for all items and record only binary occurrences (0/1) of items in transactions. High utility pattern mining has become an important research issue in data mining because it considers both the non-binary frequency values of items in transactions and a different profit value for each item. However, most existing high utility pattern mining algorithms suffer from the level-wise candidate generation-and-test problem and generate too many candidate patterns. Moreover, they need several database scans, the number of which depends directly on the maximum candidate length. In this paper, we present a novel tree-based candidate pruning technique, called HUC-Prune (High Utility Candidates Prune), to solve these problems. Our technique uses a novel tree structure, called HUC-tree (High Utility Candidates tree), to capture important utility information of the candidate patterns. HUC-Prune avoids the level-wise candidate generation process by adopting a pattern growth approach. In contrast to the existing algorithms, its number of database scans is completely independent of the maximum candidate length. Extensive experimental results show that our algorithm is very efficient for high utility pattern mining and that it outperforms the existing algorithms.
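To make the distinction between binary support and utility concrete, the following is a minimal Python sketch of the standard itemset utility definition (purchased quantity times unit profit, summed over the transactions that contain the whole itemset). It is not HUC-Prune and does not build a HUC-tree; it is a brute-force enumeration for illustration only. The toy data and the names `PROFIT`, `TRANSACTIONS`, `support`, `utility`, and `high_utility_itemsets` are assumptions introduced here, not taken from the paper.

```python
from itertools import combinations

# Hypothetical toy data (not from the paper).
# PROFIT maps each item to its unit profit; each transaction maps
# an item to the quantity purchased in that transaction.
PROFIT = {"a": 2, "b": 6, "c": 3}
TRANSACTIONS = [
    {"a": 2, "b": 1},
    {"a": 1, "c": 4},
    {"a": 3, "b": 2, "c": 1},
]


def support(itemset, transactions):
    """Binary view: count the transactions that contain every item."""
    return sum(1 for t in transactions if all(i in t for i in itemset))


def utility(itemset, transactions, profit):
    """Utility view: sum quantity * unit profit over the itemset's items,
    across all transactions that contain the whole itemset."""
    total = 0
    for t in transactions:
        if all(i in t for i in itemset):
            total += sum(t[i] * profit[i] for i in itemset)
    return total


def high_utility_itemsets(transactions, profit, min_utility):
    """Brute-force enumeration of every itemset (exponential in the number
    of items), keeping those whose utility reaches min_utility."""
    items = sorted(profit)
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            u = utility(itemset, transactions, profit)
            if u >= min_utility:
                result[itemset] = u
    return result


if __name__ == "__main__":
    print(support(("a", "b"), TRANSACTIONS))          # transactions containing both
    print(utility(("a", "b"), TRANSACTIONS, PROFIT))  # profit earned by {a, b}
    print(high_utility_itemsets(TRANSACTIONS, PROFIT, min_utility=20))
```

On this toy data with a threshold of 20, the itemset {b, c} has utility 15 and fails, while its superset {a, b, c} has utility 21 and passes. Utility is therefore not anti-monotone, so the subset-based pruning used in frequent pattern mining cannot be applied to utility directly, which is why candidate pruning is a central concern for high utility pattern mining algorithms.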



Author information


Correspondence to Byeong-Soo Jeong.


About this article

Cite this article

Ahmed, C.F., Tanbeer, S.K., Jeong, BS. et al. HUC-Prune: an efficient candidate pruning technique to mine high utility patterns. Appl Intell 34, 181–198 (2011). https://doi.org/10.1007/s10489-009-0188-5


  • DOI: https://doi.org/10.1007/s10489-009-0188-5
