An efficient utility-list based high-utility itemset mining algorithm

Cheng, Zaihe; Fang, Wei; Shen, Wei; Lin, Jerry Chun-Wei; Yuan, Bo

doi:10.1007/s10489-022-03850-4

Published: 13 July 2022

Applied Intelligence, volume 53, pages 6992–7006 (2023)

Zaihe Cheng¹,²,
Wei Fang² (ORCID: orcid.org/0000-0003-0750-4749),
Wei Shen²,
Jerry Chun-Wei Lin³ &
Bo Yuan⁴


Abstract

High-utility itemset mining (HUIM) is an important data mining task that retrieves patterns that are more meaningful and useful for decision-making. One-phase HUIM algorithms based on the utility-list structure have been shown to be the most efficient, as they can mine high-utility itemsets (HUIs) without generating candidates. However, storing itemset information in utility-lists is costly in both time and memory. To address this problem, we propose an efficient HUIM algorithm based on a simplified utility-list (HUIM-SU). HUIM-SU uses the simplified utility-list to obtain all HUIs and to reduce memory usage during the depth-first search. Based on the simplified utility-list, repeated pruning according to the transaction-weighted utilisation (TWU) reduces the number of items. In addition, a construction tree and compressed storage are introduced to further reduce the search space and memory usage. The extension utility and the itemset TWU are then proposed as upper bounds, which reduce the search space considerably. Extensive experimental results on dense and sparse datasets indicate that the proposed HUIM-SU algorithm is highly efficient in terms of the number of candidates, memory usage, and execution time.
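
The repeated TWU-based pruning mentioned in the abstract can be illustrated with a minimal sketch. This is not the HUIM-SU implementation: it uses the standard definition of TWU (the sum of the utilities of all transactions containing an item), and the toy database, the threshold of 18, and the function names `twu_per_item` and `prune_by_twu` are assumptions made for illustration. Removing a low-TWU item lowers the utility of the transactions that contained it, which can push other items below the threshold, so the pruning is repeated until nothing changes.

```python
from collections import defaultdict

# Hypothetical toy database: each transaction maps item -> utility of that
# item in the transaction (e.g. quantity * unit profit).
transactions = [
    {"a": 5, "b": 2, "c": 1},
    {"b": 4, "c": 3, "d": 6},
    {"a": 10, "c": 2},
    {"d": 1, "e": 2},
]


def twu_per_item(db):
    """Return the transaction-weighted utilisation (TWU) of every item.

    The TWU of an item is the sum of the transaction utilities of all
    transactions that contain it.
    """
    twu = defaultdict(int)
    for t in db:
        transaction_utility = sum(t.values())
        for item in t:
            twu[item] += transaction_utility
    return dict(twu)


def prune_by_twu(db, min_util):
    """Repeatedly drop items whose TWU is below min_util.

    Returns the pruned database and the TWU of the surviving items.
    """
    db = [dict(t) for t in db]  # work on a copy
    while True:
        twu = twu_per_item(db)
        low = {item for item, value in twu.items() if value < min_util}
        if not low:
            return db, twu
        # Remove unpromising items, then discard transactions left empty.
        db = [{i: u for i, u in t.items() if i not in low} for t in db]
        db = [t for t in db if t]


pruned_db, final_twu = prune_by_twu(transactions, min_util=18)
print(pruned_db)   # [{'a': 5, 'c': 1}, {'c': 3}, {'a': 10, 'c': 2}]
print(final_twu)   # {'a': 18, 'c': 21}
```

In this toy run the first pass removes `d` and `e`; only after that does the TWU of `b` fall from 21 to 15, so a second pass removes `b` as well. A single pass would have kept it.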


Acknowledgements

This work was supported in part by the National Key R&D Program of China under Grants 2017YFC1601800 and 2017YFC1601000, in part by the National Natural Science Foundation of China under Grants 62073155, 62002137, 62106088, and 61673194, in part by the “Blue Project” in Jiangsu Universities, China, in part by the Guangdong Provincial Key Laboratory under Grant 2020B121201001, and in part by the Advanced Research Project of Specialty Leading Person in Higher Vocational Colleges in Jiangsu Province.

Funding

This work was supported in part by the National Key R&D Program of China under Grants 2017YFC1601000 and 2017YFC1601800, and in part by the National Natural Science Foundation of China under Grants 62073155, 62106088, 61673194, and 61672263.

Author information

Authors and Affiliations

School of Internet of Things, Wuxi Institute of Technology, Gaolang Road, Wuxi, Jiangsu, 214121, China
Zaihe Cheng
Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Lihu Avenue, Wuxi, Jiangsu, 214122, China
Zaihe Cheng, Wei Fang & Wei Shen
Department of Computer Science, Electrical Engineering and Mathematical Sciences, Western Norway University of Applied Sciences, Bergen, Norway
Jerry Chun-Wei Lin
Computer Science and Engineering Department, Southern University of Science and Technology, Shenzhen, China
Bo Yuan

Contributions

Zaihe Cheng: Methodology. Wei Fang: Supervision. Wei Shen: Software, Writing (original draft preparation). Jerry Chun-Wei Lin, Bo Yuan: Resources, English language.

Corresponding author

Correspondence to Wei Fang.

Ethics declarations

Ethics approval and consent to participate

This work does not contain any studies with human participants performed by any of the authors.

Consent for Publication

Informed consent was obtained from all individual participants included in this work.

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Cheng, Z., Fang, W., Shen, W. et al. An efficient utility-list based high-utility itemset mining algorithm. Appl Intell 53, 6992–7006 (2023). https://doi.org/10.1007/s10489-022-03850-4

Accepted: 03 June 2022
Published: 13 July 2022
Issue Date: March 2023
DOI: https://doi.org/10.1007/s10489-022-03850-4
