An efficient utility-list based high-utility itemset mining algorithm

Applied Intelligence

Abstract

High-utility itemset mining (HUIM) is an important data mining task that retrieves patterns that are more meaningful and useful for decision-making. One-phase HUIM algorithms based on the utility-list structure have been shown to be the most efficient, as they can mine high-utility itemsets (HUIs) without generating candidates. However, storing itemset information in the utility-list is time-consuming and memory-intensive. To address this problem, we propose an efficient simplified utility-list-based HUIM algorithm (HUIM-SU). HUIM-SU uses a simplified utility-list to obtain all HUIs while reducing memory usage during the depth-first search. Based on the simplified utility-list, repeated pruning according to the transaction-weighted utilisation (TWU) reduces the number of items. In addition, a construction tree and compressed storage are introduced to further reduce the search space and memory usage. The extension utility and itemset TWU are then proposed as upper bounds, which reduce the search space considerably. Extensive experimental results on dense and sparse datasets indicate that the proposed HUIM-SU algorithm is highly efficient in terms of the number of candidates, memory usage, and execution time.
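
The repeated TWU pruning mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the HUIM-SU implementation: it uses the standard definition of TWU (the sum of the utilities of all transactions containing an item), and the function name `twu_prune`, the toy database, and the threshold are assumptions made here for illustration. Removing a low-TWU item shrinks the utilities of the transactions that contained it, which can push other items below the threshold, so the pass is repeated until nothing changes.

```python
from collections import defaultdict


def twu_prune(transactions, min_util):
    """Repeatedly drop items whose TWU is below min_util (illustrative only).

    transactions: list of dicts mapping item -> utility of the item in that
    transaction. Returns (pruned transactions, TWU of the surviving items).
    """
    db = [dict(t) for t in transactions]  # work on a copy
    while True:
        twu = defaultdict(int)
        for t in db:
            tu = sum(t.values())  # transaction utility of the current t
            for item in t:
                twu[item] += tu
        weak = {item for item, value in twu.items() if value < min_util}
        if not weak:
            return db, dict(twu)
        # Remove weak items, then discard transactions that became empty.
        db = [{i: u for i, u in t.items() if i not in weak} for t in db]
        db = [t for t in db if t]


# Hypothetical toy database: item -> utility within each transaction.
toy_db = [
    {"a": 5, "b": 2, "c": 1},
    {"b": 4, "c": 3},
    {"a": 5, "d": 1},
    {"c": 2, "d": 2, "e": 1},
]
pruned_db, surviving_twu = twu_prune(toy_db, min_util=12)
print(surviving_twu)
```

In this toy run the first pass removes `d` and `e`, and the second pass recomputes the TWU of `a` as 13 instead of 14 because the third transaction has shrunk. A single pass would keep the looser bound.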

Acknowledgements

This work was supported in part by the National Key R&D Program of China under Grants 2017YFC1601800 and 2017YFC1601000, in part by the National Natural Science Foundation of China under Grants 62073155, 62002137, 62106088, and 61673194, in part by the “Blue Project” in Jiangsu Universities, China, in part by the Guangdong Provincial Key Laboratory under Grant 2020B121201001, and in part by the Advanced Research Project of Specialty Leading Person in Higher Vocational Colleges in Jiangsu Province.

Funding

This work was supported in part by the National Key R&D Program of China under Grants 2017YFC1601000 and 2017YFC1601800, and in part by the National Natural Science Foundation of China under Grants 62073155, 62106088, 61673194, and 61672263.

Author information

Contributions

Zaihe Cheng: Methodology. Wei Fang: Supervision. Wei Shen: Software, Writing - Original draft preparation. Jerry Chun-Wei Lin, Bo Yuan: Resources, English language.

Corresponding author

Correspondence to Wei Fang.

Ethics declarations

Ethics approval and consent to participate

This work does not contain any studies with human participants performed by any of the authors.

Consent for Publication

Informed consent was obtained from all individual participants included in this work.

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Cheng, Z., Fang, W., Shen, W. et al. An efficient utility-list based high-utility itemset mining algorithm. Appl Intell 53, 6992–7006 (2023). https://doi.org/10.1007/s10489-022-03850-4

  • DOI: https://doi.org/10.1007/s10489-022-03850-4
