Abstract
High Utility Itemset Mining (HUIM) is an important research area in the field of data mining and knowledge discovery. HUIM aims to discover high utility patterns in a given database based on a utility threshold, where the utility is a user-defined objective function. Existing HUIM algorithms fail to consider how the occurrences of patterns are actually distributed in the database: they treat all patterns with the same utility value as equally important. However, this may not always be the case, since some patterns may occur in localized clusters in the database while others occur more uniformly. Frequent Itemset Mining (FIM) approaches also fail to address this problem, since they are based on a support framework that considers only the frequency of occurrence of an itemset in the database. To address this research gap, this study introduces a novel concept of maintaining a count value for the itemsets, called the re-induction count, in order to keep track of the relative occurrence of items in the database. A novel algorithm, named Ri-Miner, is proposed to mine itemsets based on both a minimum utility threshold and their re-induction count. The experimental results show that Ri-Miner outperforms existing methods, achieving a 15% improvement in execution time and a 10% reduction in memory usage. The proposed method can be useful in various applications that require capturing the underlying occurrence behaviour of patterns in the database, such as market-basket analysis, healthcare, and web stream analytics.
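As a rough illustration of the utility-threshold idea described in the abstract, the sketch below computes itemset utilities by brute force under the commonly used definition (purchased quantity times unit profit, summed over the transactions that contain the whole itemset). It is not Ri-Miner: the re-induction count is not defined in this section, so it is not modelled here. The toy database, the profit table, and the function names `itemset_utility` and `high_utility_itemsets` are assumptions made purely for illustration.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
]
# Hypothetical unit profit per item.
unit_profit = {"a": 5, "b": 2, "c": 1}


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over every transaction containing the whole itemset."""
    total = 0
    for tx in transactions:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * unit_profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, min_util):
    """Brute-force enumeration of all itemsets; practical miners prune instead."""
    items = sorted({item for tx in transactions for item in tx})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, unit_profit)
            if utility >= min_util:
                result[itemset] = utility
    return result


if __name__ == "__main__":
    print(high_utility_itemsets(transactions, unit_profit, min_util=15))
```

In this toy data, the itemset `("b",)` has utility 6 while its superset `("a", "b")` has utility 16, so a superset of a low-utility itemset can still pass the threshold. This is why utility mining cannot rely directly on the downward-closure pruning used in support-based frequent itemset mining.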
Data availability and access
The data used for the experiments of this study are openly available in the SPMF library [33].
References
Liu Y, Liao W-K, Choudhary A (2005) A two-phase algorithm for fast discovery of high utility itemsets. In: Pacific-Asia conference on knowledge discovery and data mining. Springer, pp 689–695
Agrawal R, Srikant R, et al (1994) Fast algorithms for mining association rules. In: Proc. 20th Int. Conf. Very Large Data Bases, VLDB, vol 1215. Citeseer, pp 487–499
Ahmed CF, Tanbeer SK, Jeong B-S, Lee Y-K (2011) Huc-prune: an efficient candidate pruning technique to mine high utility patterns. Appl Intell 34:181–198
Li G, Shang T, Zhang Y (2023) Efficient mining high average-utility itemsets with effective pruning strategies and novel list structure. Appl Intell 53(5):6099–6118
Liu X, Niu X, Fournier-Viger P (2021) Fast top-k association rule mining using rule generation property pruning. Appl Intell 51:2077–2093
Kumar R, Singh K (2023) High utility itemsets mining from transactional databases: a survey. Appl Intell 1–49
Yao H, Hamilton HJ, Butz CJ (2004) A foundational approach to mining itemset utilities from databases. In: Proceedings of the 2004 SIAM international conference on data mining. SIAM, pp 482–486
Zaki MJ (1999) Parallel and distributed association mining: a survey. IEEE Concurr 4:14–25
Tseng VS, Wu C-W, Shie B-E, Yu PS (2010) Up-growth: an efficient algorithm for high utility itemset mining. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 253–262
Qu J-F, Fournier-Viger P, Liu M, Hang B, Hu C (2023) Mining high utility itemsets using prefix trees and utility vectors. IEEE Trans Knowl Data Eng
Hu J, Mojsilovic A (2007) High-utility pattern mining: a method for discovery of high-utility item sets. Pattern Recogn 40(11):3317–3324
Ahmed CF, Tanbeer SK, Jeong B-S, Lee Y-K (2009) Efficient tree structures for high utility pattern mining in incremental databases. IEEE Trans Knowl Data Eng 21(12):1708–1721
Subramanian K, Kandhasamy P (2023) Mining high utility itemsets using genetic algorithm based-particle swarm optimization (ga-pso). J Intell Fuzzy Syst (Preprint), 1–21
Freitas AA (2003) A survey of evolutionary algorithms for data mining and knowledge discovery. Advances in evolutionary computing: theory and applications, 819–845
Liu M, Qu J (2012) Mining high utility itemsets without candidate generation. In: Proceedings of the 21st ACM International conference on information and knowledge management. ACM, pp 55–64
Fournier-Viger P, Wu C-W, Zida S, Tseng VS (2014) Fhm: Faster high-utility itemset mining using estimated utility co-occurrence pruning. In: International symposium on methodologies for intelligent systems. Springer, pp 83–92
Krishnamoorthy S (2015) Pruning strategies for mining high utility itemsets. Expert Syst Appl 42(5):2371–2381
Zida S, Fournier-Viger P, Lin JC-W, Wu C-W, Tseng VS (2017) Efim: a fast and memory efficient algorithm for high-utility itemset mining. Knowl Inf Syst 51(2):595–625
Duong Q-H, Fournier-Viger P, Ramampiaro H, Nørvåg K, Dam T-L (2018) Efficient high utility itemset mining using buffered utility-lists. Appl Intell 48(7):1859–1877
Cheng Z, Fang W, Shen W, Lin JC-W, Yuan B (2022) An efficient utility-list based high-utility itemset mining algorithm. Appl Intell, 1–15
Fournier-Viger P, Zida S (2015) Foshu: faster on-shelf high utility itemset mining–with or without negative unit profit. In: Proceedings of the 30th annual ACM symposium on applied computing. ACM, pp 857–864
Tseng VS, Wu C-W, Fournier-Viger P, Yu PS (2015) Efficient algorithms for mining the concise and lossless representation of high utility itemsets. IEEE Trans Knowl Data Eng 27(3):726–739
Lin JC-W, Gan W, Fournier-Viger P, Hong T-P, Tseng VS (2017) Efficiently mining uncertain high-utility itemsets. Soft Comput 21(11):2801–2820
Chu C-J, Tseng VS, Liang T (2009) An efficient algorithm for mining high utility itemsets with negative item values in large databases. Appl Math Comput 215(2):767–778
Lin JC-W, Gan W, Hong T-P, Tseng VS (2015) Efficient algorithms for mining up-to-date high-utility patterns. Adv Eng Inform 29(3):648–661
Han M, Zhang N, Wang L, Li X, Cheng H (2022) Mining closed high utility patterns with negative utility in dynamic databases. Appl Intell, 1–18
Li Y, Zhang Z, Chen W, Min F (2014) Mining high utility itemsets with discount strategies. J Inf Comput Sci 11(17):6297–6307
Lin JC-W, Gan W, Hong T-P (2016) A fast maintenance algorithm of the discovered high-utility itemsets with transaction deletion. Intell Data Anal 20(4):891–913
Huang W-M, Hong T-P, Lan G-C, Chiang M-C, Lin JC-W (2017) Temporal-based fuzzy utility mining. IEEE Access 5:26639–26652
Cheng Z, Fang W, Shen W, Lin JC-W, Yuan B (2023) An efficient utility-list based high-utility itemset mining algorithm. Appl Intell 53(6):6992–7006
Ali A, Ullah I, Shabaz M, Sharafian A, Khan MA, Bai X, Qiu L (2024) A resource-aware multi-graph neural network for urban traffic flow prediction in multi-access edge computing systems. IEEE Trans Consum Electron
Zakarya M, Khan AA, Qazani MRC, Ali H, Al-Bahri M, Khan AUR, Ali A, Khan R (2024) Sustainable computing across datacenters: a review of enabling models and techniques. Comput Sci Rev 52:100620
Fournier-Viger P, Lin JC-W, Gomariz A, Gueniche T, Soltani A, Deng Z, Lam HT (2016) The spmf open-source data mining library version 2. In: Joint european conference on machine learning and knowledge discovery in databases. Springer, pp 36–40
Chauhan S, Singh M, Aggarwal AK (2023) Investigative analysis of different mutation on diversity-driven multi-parent evolutionary algorithm and its application in area coverage optimization of wsn. Soft Comput 1–27
Chauhan S, Singh M, Aggarwal AK (2023) Designing of optimal digital iir filter in the multi-objective framework using an evolutionary algorithm. Eng Appl Artif Intell 119:105803
Author information
Contributions
Both authors conceived the proposed idea and developed the theoretical formalism of re-induction based utility mining. The design and implementation of the algorithm, the analysis of the results, and the writing of the manuscript were done by Pushp S. Mathur. Prof. Satish Chand directed and guided the research study.
Ethics declarations
Competing Interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Ethical and informed consent for data used
The data used in the experiments designed for this study are openly available on the internet; hence, ethical and informed consent for the data used is not applicable to this study.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Mathur, P.S., Chand, S. Re-induction based mining for high utility item-sets. Appl Intell 55, 75 (2025). https://doi.org/10.1007/s10489-024-05855-7
DOI: https://doi.org/10.1007/s10489-024-05855-7