Skip to main content
Log in

Indexed list-based high utility pattern mining with utility upper-bound reduction and pattern combination techniques

  • Regular Paper
  • Published:
Knowledge and Information Systems Aims and scope Submit manuscript

Abstract

High utility pattern mining has been studied as an essential topic in the field of pattern mining in order to satisfy requirements of many real-world applications that need to process non-binary databases including item importance such as market analysis. In this paper, we propose an efficient algorithm with a novel indexed list-based data structure for mining high utility patterns. Previous approaches first generate an enormous number of candidate patterns on the basis of overestimation methods in their mining processes and then identify actual high utility patterns from the candidates through an additional database scan, which leads to high computational overheads. Although several list-based algorithms to discover high utility patterns without candidate generation have been suggested in recent years, they require a large number of comparison operations. Our method facilitates efficient mining of high utility patterns with the proposed indexed list by effectively reducing the total number of such operations. Moreover, we develop two techniques based on this novel data structure to more enhance mining performance of the proposed method. Experimental results on real and synthetic datasets show that the proposed algorithm mines high utility patterns more efficiently than the state-of-the-art algorithms.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17
Fig. 18

Similar content being viewed by others

References

  1. Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases. In: Bocca JB, Jarke M, Zaniolo C (eds) Proceedings of 20th international conference on very large data bases, Santiago de Chile, Chile, September 1994, pp 487–499

  2. Ahmed CF, Tanbeer SK, Jeong B-S, Choi H-J (2012) Interactive mining of high utility patterns over data streams. Expert Syst Appl 39(15):11979–11991

    Article  Google Scholar 

  3. Ahmed CF, Tanbeer SK, Jeong B-S, Lee Y-K, Choi H-J (2012) Single-pass incremental and interactive mining for weighted frequent patterns. Expert Syst Appl 39(9):7976–7994

    Article  Google Scholar 

  4. Ahmed CF, Tanbeer SK, Jeong B-S, Lee Y-K (2009) Efficient tree structures for high utility pattern mining in incremental databases. IEEE Trans Knowl Data Eng 21(12):1708–1721

    Article  Google Scholar 

  5. Chen L, Mei Q (2014) Mining frequent items in data stream using time fading model. Inf Sci 257(1):54–69

    Article  MathSciNet  MATH  Google Scholar 

  6. Feng L, Wang L, Jin B (2013) UT-tree: efficient mining of high utility itemsets from data streams. Intell Data Anal 17(4):585–602

    Google Scholar 

  7. Fournier-Viger P, Wu C-W, Zida S, Tseng VS (2014) FHM: faster high-utility itemset mining using estimated utility co-occurrence pruning. In: Andreasen T, Christiansen H, Talavera JCC, Ras ZW (eds) Proceedings of 21st international symposium on methodologies for intelligent systems, Roskilde, Denmark, June 2014. Lecture Notes in Computer Science 8502, Springer, Berlin, pp 83–92

  8. Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. In: Chen W, Naughton JF, Bernstein PA (eds) Proceedings of 2000 ACM SIGMOD international conference on management of data, Dallas, Texas, USA, May 2000, pp 1–12

  9. Huynh-Thi-Le Q, Le T, Vo B, Le HB (2015) An efficient and effective algorithm for mining top-rank-k frequent patterns. Expert Syst Appl 42(1):156–164

    Article  Google Scholar 

  10. Kim J, Yun U, Pyun G, Ryang H, Lee G, Yoon E, Ryu K (2015) A blog ranking algorithm using analysis of both blog influence and characteristics of blog posts. Clust Comput 18(1):157–164

    Article  Google Scholar 

  11. Lan GC, Hong TP, Tseng VS, Wang SL (2014) Applying the maximum utility measure in high utility sequential pattern mining. Expert Syst Appl 41(11):5071–5081

    Article  Google Scholar 

  12. Lan GC, Hong TP, Tseng VS (2014) An efficient projection-based indexing approach for mining high utility itemsets. Knowl Inf Syst 38(1):85–107

    Article  Google Scholar 

  13. Lee G, Yun U, Ryang H (2015) Mining weighted erasable patterns by using underestimated constraint-based pruning technique. J Intell Fuzzy Syst 28(3):1145–1157

    Google Scholar 

  14. Lee G, Yun U, Ryu K (2014) Sliding window based weighted maximal frequent pattern mining over data streams. Expert Syst Appl 41(2):694–708

    Article  Google Scholar 

  15. Li YC, Yeh J-S, Chang C-C (2008) Isolated items discarding strategy for discovering high utility itemsets. Data Knowl Eng 64(1):198–217

    Article  Google Scholar 

  16. Lin C-W, Hong T-P, Lan G-C, Wong J-W, Lin W-Y (2015) Efficient updating of discovered high-utility itemsets for transaction deletion in dynamic databases. Adv Eng Inform 29(1):16–27

    Article  Google Scholar 

  17. Lin C-W, Lan G-C, Hong T-P (2015) Mining high utility itemsets for transaction deletion in a dynamic database. Intell Data Anal 19(1):43–55

    Google Scholar 

  18. Lin C-W, Hong T-P, Lan G-C, Wong J-W, Lin W-Y (2014) Incrementally mining high utility patterns based on pre-large concept. Appl Intell 40(2):343–357

    Article  Google Scholar 

  19. Liu M, Qu J-F (2012) Mining high utility itemsets without candidate generation. In: Chen X-W, Lebanon G, Wang H, Zaki MJ (eds) Proceedings of 21st ACM international conference on information and knowledge management (CIKM’12), Maui, HI, USA, October 2012, pp 55–64

  20. Liu J, Wang K, Fung BCM (2012) Direct discovery of high utility itemsets without candidate generation. In: Zaki MJ, Siebes A, Yu JX, Goethals B, Webb GI, Wu X (eds) Proceedings of 12th IEEE international conference on data mining (ICDM’12), Brussels, Belgium, December 2012, pp 984–898

  21. Liu Y, Liao W-K, Choudhary AN (2005) A two-phase algorithm for fast discovery of high utility itemsets. In: Ho TB, Cheung DW-L, Liu H (eds) Proceedings of advances in knowledge discovery and data mining, 9th Pacific-Asia conference (PAKDD’05), Hanoi, Vietnam, May 2005, pp 689–695

  22. Pisharath J, Liu Y, Ozisikyilmaz B, Narayanan R, Liao WK, Choudhary A, Memik G (2005) NU-MineBench version 2.0 dataset and technical report. http://cucis.ece.northwestern.edu/projects/DMS/MineBench.html

  23. Pyun G, Yun U (2014) Mining top-k frequent patterns with combination reducing techniques. Appl Intell 41(1):76–98

    Article  Google Scholar 

  24. Pyun G, Yun U, Ryu K (2014) Efficient frequent pattern mining based on linear prefix tree. Knowl Based Syst 55:125–139

    Article  Google Scholar 

  25. Ryang H, Yun U (2015) Top-k high utility pattern mining with effective threshold raising strategies. Knowl Based Syst 76:109–126

    Article  Google Scholar 

  26. Ryang H, Yun U, Ryu K (2014) Discovering high utility itemsets with multiple minimum supports. Intell Data Anal 18(6):1027–1047

    Google Scholar 

  27. Sahoo J, Das AK, Goswami A (2015) An efficient approach for mining association rules from high utility itemsets. Expert Syst Appl 42(13):5754–5778

    Article  Google Scholar 

  28. Song W, Liu Y, Li J (2014) Mining high utility itemsets by dynamically pruning the tree structure. Appl Intell 40(1):29–43

    Article  Google Scholar 

  29. Troiano L, Scibelli G (2014) Mining frequent itemsets in data streams within a time horizon. Data Knowl Eng 89:21–37

    Article  MATH  Google Scholar 

  30. Tseng VS, Wu C-W, Fournier-Viger P, Yu PS (2015) Efficient algorithms for mining the concise and lossless representation of high utility itemsets. IEEE Trans Knowl Data Eng 27(3):726–739

    Article  Google Scholar 

  31. Tseng VS, Shie B-E, Wu C-W, Yu PS (2013) Efficient algorithms for mining high utility itemsets from transactional databases. IEEE Trans Knowl Data Eng 25(8):1772–1786

    Article  Google Scholar 

  32. Tseng VS, Wu C-W, Shie B-E, Yu PS (2010) UP-Growth: an efficient algorithm for high utility itemset mining. In: Rao B, Krishnapuram B, Tomkins A, Yang Q (eds) Proceedings of 16th ACM SIGKDD international conference on knowledge discovery and data mining, DC, USA, Washington, July 2010, pp 253–262

  33. Yun U, Ryang H (2015) Incremental high utility pattern mining with static and dynamic databases. Appl Intell 42(2):323–352

    Article  Google Scholar 

  34. Yun U, Kim J (2015) A fast perturbation algorithm using tree structure for privacy preserving utility mining. Expert Syst Appl 42(3):1149–1165

    Article  Google Scholar 

  35. Yun U, Ryang H, Ryu K (2014) High utility itemset mining with techniques for reducing overestimated utilities and pruning candidates. Expert Syst Appl 41(8):3861–3878

    Article  Google Scholar 

  36. Zhang X, Deng Z-H (2015) Mining summarization of high utility itemsets. Knowl Based Syst 84:67–77

    Article  Google Scholar 

  37. ZiHayat M, An A (2014) Mining top-k high utility patterns over data streams. Inf Sci 285:138–161

    Article  MathSciNet  MATH  Google Scholar 

Download references

Acknowledgments

This research was supported by the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (NRF Nos. 20152062051 and 20155054624).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Unil Yun.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Ryang, H., Yun, U. Indexed list-based high utility pattern mining with utility upper-bound reduction and pattern combination techniques. Knowl Inf Syst 51, 627–659 (2017). https://doi.org/10.1007/s10115-016-0989-x

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10115-016-0989-x

Keywords

Navigation