HUC-Prune: an efficient candidate pruning technique to mine high utility patterns

Ahmed, Chowdhury Farhan; Tanbeer, Syed Khairuzzaman; Jeong, Byeong-Soo; Lee, Young-Koo

doi:10.1007/s10489-009-0188-5

Published: 14 July 2009

Applied Intelligence, Volume 34, pages 181–198 (2011)

Abstract

Traditional frequent pattern mining methods assume an equal profit or weight for all items and record only binary (0/1) occurrences of items in transactions. High utility pattern mining has become an important research topic in data mining because it considers both the non-binary frequency values of items in transactions and a different profit value for each item. However, most existing high utility pattern mining algorithms suffer from the level-wise candidate generation-and-test problem and generate too many candidate patterns. Moreover, they need several database scans, the number of which depends directly on the maximum candidate length. In this paper, we present a novel tree-based candidate pruning technique, called HUC-Prune (High Utility Candidates Prune), to solve these problems. Our technique uses a novel tree structure, called HUC-tree (High Utility Candidates tree), to capture important utility information of the candidate patterns. HUC-Prune avoids the level-wise candidate generation process by adopting a pattern growth approach. In contrast to the existing algorithms, its number of database scans is completely independent of the maximum candidate length. Extensive experimental results show that our algorithm is efficient for high utility pattern mining and that it outperforms the existing algorithms.
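
To make the problem setting concrete, the following is a minimal sketch of how the utility of an itemset is commonly computed: the purchased quantity of each item times its unit profit, summed over every transaction that contains the whole itemset. It is a brute-force illustration only and does not implement HUC-Prune or the HUC-tree, whose details are not given in the abstract. The toy database, the profit table, and the names `itemset_utility` and `high_utility_itemsets` are assumptions made for this example.

```python
from itertools import combinations

# Hypothetical toy data (not from the paper): unit profit per item and,
# for each transaction, the non-binary quantity of every item it contains.
PROFIT = {"a": 2, "b": 6, "c": 3}
DATABASE = [
    {"a": 2, "b": 1},
    {"a": 1, "c": 3},
    {"a": 4, "b": 2, "c": 1},
]


def itemset_utility(itemset, database, profit):
    """Sum quantity * profit over transactions containing the whole itemset."""
    total = 0
    for transaction in database:
        if all(item in transaction for item in itemset):
            total += sum(transaction[item] * profit[item] for item in itemset)
    return total


def high_utility_itemsets(database, profit, min_utility):
    """Enumerate every itemset and keep those meeting the utility threshold.

    Exhaustive enumeration is exponential in the number of items; it is
    shown only to define the target output, not as a practical miner.
    """
    items = sorted(profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, database, profit)
            if utility >= min_utility:
                result[itemset] = utility
    return result


if __name__ == "__main__":
    for itemset, utility in high_utility_itemsets(DATABASE, PROFIT, 20).items():
        print(itemset, utility)
```

With a threshold of 20 on this toy data, the itemset `("a", "c")` qualifies even though neither `("a",)` nor `("c",)` does on its own. Utility is therefore not downward closed the way support is, which is one reason candidate pruning for high utility patterns needs more care than in frequent pattern mining.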

Author information

Authors and Affiliations

Department of Computer Engineering, Kyung Hee University, 1 Seochun-dong, Kihung-gu, Youngin-si, Kyunggi-do, 446-701, Republic of Korea
Chowdhury Farhan Ahmed, Syed Khairuzzaman Tanbeer, Byeong-Soo Jeong & Young-Koo Lee

Corresponding author

Correspondence to Byeong-Soo Jeong.

About this article

Cite this article

Ahmed, C.F., Tanbeer, S.K., Jeong, BS. et al. HUC-Prune: an efficient candidate pruning technique to mine high utility patterns. Appl Intell 34, 181–198 (2011). https://doi.org/10.1007/s10489-009-0188-5

Received: 25 January 2009
Accepted: 15 June 2009
Published: 14 July 2009
Issue Date: April 2011
DOI: https://doi.org/10.1007/s10489-009-0188-5
