Re-induction based mining for high utility item-sets

Applied Intelligence

Abstract

High Utility Itemset Mining (HUIM) is an important research area in the field of data mining and knowledge discovery. HUIM aims to discover high utility patterns from a given database based on a utility threshold, where the utility is a user-defined objective function. Existing HUIM algorithms fail to consider the actual occurrence behaviour of patterns in the database: they treat all patterns having the same utility value as equally important. However, this may not always be the case, since some patterns may occur in localized clusters in the database while others occur more uniformly. Frequent Itemset Mining (FIM) approaches also fail to address this problem, since they are based on a support framework that considers only the frequency of occurrence of an itemset in the database. To address this research gap, this study introduces a novel concept of maintaining a count value for each itemset, called the re-induction count, in order to keep track of the relative occurrence of items in the database. A novel algorithm, named Ri-Miner, is proposed to mine itemsets based on both a minimum utility threshold and their re-induction count. The experimental results show that Ri-Miner outperforms existing methods by achieving a 15% improvement in execution time and a 10% reduction in memory usage. The proposed method can be useful in various applications that require capturing the underlying occurrence behaviour of patterns in the database, such as market-basket analysis, healthcare, and web stream analytics.
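To make the motivation concrete, the following minimal Python sketch computes itemset utility in the standard HUIM sense (quantity times unit profit, summed over the transactions that contain the itemset) and lists where each itemset occurs. The toy database, the profit table, and the function names are hypothetical and chosen only for illustration. The sketch does not implement Ri-Miner or the re-induction count, whose definitions are not given in the abstract; it only shows that two itemsets can share the same utility while one occurs in a cluster and the other is spread across the database.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 1},  # T0
    {"a": 1, "c": 1},  # T1
    {"a": 1},          # T2
    {"b": 1, "c": 2},  # T3
    {"c": 1},          # T4
    {"c": 3},          # T5
    {"b": 1},          # T6
]
# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 5, "c": 1}


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over every transaction containing the itemset."""
    total = 0
    for tx in transactions:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * unit_profit[item] for item in itemset)
    return total


def occurrence_positions(itemset, transactions):
    """Indices of the transactions in which the whole itemset appears."""
    return [i for i, tx in enumerate(transactions)
            if all(item in tx for item in itemset)]


def high_utility_itemsets(transactions, unit_profit, min_utility):
    """Brute-force enumeration; fine for a toy example, not for real data."""
    items = sorted({item for tx in transactions for item in tx})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, unit_profit)
            if utility >= min_utility:
                result[itemset] = utility
    return result


if __name__ == "__main__":
    for itemset, utility in high_utility_itemsets(transactions, unit_profit, 15).items():
        print(itemset, utility, occurrence_positions(itemset, transactions))
```

With a minimum utility of 15, the itemsets ("a",) and ("b",) both qualify with the same utility, yet "a" appears only in the first three transactions while "b" recurs at regular intervals. A purely utility-based miner cannot tell these two cases apart, which is the gap the abstract describes.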

Data availability and access

The data used for the experiments of this study are openly available in the SPMF library [33].

References

  1. Liu Y, Liao W-K, Choudhary A (2005) A two-phase algorithm for fast discovery of high utility itemsets. In: Pacific-Asia conference on knowledge discovery and data mining. Springer, pp 689–695

  2. Agrawal R, Srikant R, et al (1994) Fast algorithms for mining association rules. In: Proc. 20th Int. Conf. Very Large Data Bases, VLDB, vol 1215. Citeseer, pp 487–499

  3. Ahmed CF, Tanbeer SK, Jeong B-S, Lee Y-K (2011) Huc-prune: an efficient candidate pruning technique to mine high utility patterns. Appl Intell 34:181–198

  4. Li G, Shang T, Zhang Y (2023) Efficient mining high average-utility itemsets with effective pruning strategies and novel list structure. Appl Intell 53(5):6099–6118

  5. Liu X, Niu X, Fournier-Viger P (2021) Fast top-k association rule mining using rule generation property pruning. Appl Intell 51:2077–2093

  6. Kumar R, Singh K (2023) High utility itemsets mining from transactional databases: a survey. Appl Intell 1–49

  7. Yao H, Hamilton HJ, Butz CJ (2004) A foundational approach to mining itemset utilities from databases. In: Proceedings of the 2004 SIAM international conference on data mining. SIAM, pp 482–486

  8. Zaki MJ (1999) Parallel and distributed association mining: a survey. IEEE Concurr 4:14–25

  9. Tseng VS, Wu C-W, Shie B-E, Yu PS (2010) Up-growth: an efficient algorithm for high utility itemset mining. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 253–262

  10. Qu J-F, Fournier-Viger P, Liu M, Hang B, Hu C (2023) Mining high utility itemsets using prefix trees and utility vectors. IEEE Trans Knowl Data Eng

  11. Hu J, Mojsilovic A (2007) High-utility pattern mining: a method for discovery of high-utility item sets. Pattern Recogn 40(11):3317–3324

  12. Ahmed CF, Tanbeer SK, Jeong B-S, Lee Y-K (2009) Efficient tree structures for high utility pattern mining in incremental databases. IEEE Trans Knowl Data Eng 21(12):1708–1721

  13. Subramanian K, Kandhasamy P (2023) Mining high utility itemsets using genetic algorithm based-particle swarm optimization (ga-pso). J Intell Fuzzy Syst (Preprint), 1–21

  14. Freitas AA (2003) A survey of evolutionary algorithms for data mining and knowledge discovery. Advances in evolutionary computing: theory and applications, 819–845

  15. Liu M, Qu J (2012) Mining high utility itemsets without candidate generation. In: Proceedings of the 21st ACM International conference on information and knowledge management. ACM, pp 55–64

  16. Fournier-Viger P, Wu C-W, Zida S, Tseng VS (2014) Fhm: Faster high-utility itemset mining using estimated utility co-occurrence pruning. In: International symposium on methodologies for intelligent systems. Springer, pp 83–92

  17. Krishnamoorthy S (2015) Pruning strategies for mining high utility itemsets. Expert Syst Appl 42(5):2371–2381

  18. Zida S, Fournier-Viger P, Lin JC-W, Wu C-W, Tseng VS (2017) Efim: a fast and memory efficient algorithm for high-utility itemset mining. Knowl Inf Syst 51(2):595–625

  19. Duong Q-H, Fournier-Viger P, Ramampiaro H, Nørvåg K, Dam T-L (2018) Efficient high utility itemset mining using buffered utility-lists. Appl Intell 48(7):1859–1877

  20. Cheng Z, Fang W, Shen W, Lin JC-W, Yuan B (2022) An efficient utility-list based high-utility itemset mining algorithm. Appl Intell, 1–15

  21. Fournier-Viger P, Zida S (2015) Foshu: faster on-shelf high utility itemset mining–with or without negative unit profit. In: Proceedings of the 30th annual ACM symposium on applied computing, pp 857–864

  22. Tseng VS, Wu C-W, Fournier-Viger P, Yu PS (2015) Efficient algorithms for mining the concise and lossless representation of high utility itemsets. IEEE Trans Knowl Data Eng 27(3):726–739

  23. Lin JC-W, Gan W, Fournier-Viger P, Hong T-P, Tseng VS (2017) Efficiently mining uncertain high-utility itemsets. Soft Comput 21(11):2801–2820

  24. Chu C-J, Tseng VS, Liang T (2009) An efficient algorithm for mining high utility itemsets with negative item values in large databases. Appl Math Comput 215(2):767–778

  25. Lin JC-W, Gan W, Hong T-P, Tseng VS (2015) Efficient algorithms for mining up-to-date high-utility patterns. Adv Eng Inform 29(3):648–661

  26. Han M, Zhang N, Wang L, Li X, Cheng H (2022) Mining closed high utility patterns with negative utility in dynamic databases. Appl Intell, 1–18

  27. Li Y, Zhang Z, Chen W, Min F (2014) Mining high utility itemsets with discount strategies. J Inf Comput Sci 11(17):6297–6307

  28. Lin JC-W, Gan W, Hong T-P (2016) A fast maintenance algorithm of the discovered high-utility itemsets with transaction deletion. Intell Data Anal 20(4):891–913

  29. Huang W-M, Hong T-P, Lan G-C, Chiang M-C, Lin JC-W (2017) Temporal-based fuzzy utility mining. IEEE Access 5:26639–26652

  30. Cheng Z, Fang W, Shen W, Lin JC-W, Yuan B (2023) An efficient utility-list based high-utility itemset mining algorithm. Appl Intell 53(6):6992–7006

  31. Ali A, Ullah I, Shabaz M, Sharafian A, Khan MA, Bai X, Qiu L (2024) A resource-aware multi-graph neural network for urban traffic flow prediction in multi-access edge computing systems. IEEE Trans Consum Electron

  32. Zakarya M, Khan AA, Qazani MRC, Ali H, Al-Bahri M, Khan AUR, Ali A, Khan R (2024) Sustainable computing across datacenters: a review of enabling models and techniques. Comput Sci Rev 52:100620

  33. Fournier-Viger P, Lin JC-W, Gomariz A, Gueniche T, Soltani A, Deng Z, Lam HT (2016) The spmf open-source data mining library version 2. In: Joint european conference on machine learning and knowledge discovery in databases. Springer, pp 36–40

  34. Chauhan S, Singh M, Aggarwal AK (2023) Investigative analysis of different mutation on diversity-driven multi-parent evolutionary algorithm and its application in area coverage optimization of wsn. Soft Comput 1–27

  35. Chauhan S, Singh M, Aggarwal AK (2023) Designing of optimal digital iir filter in the multi-objective framework using an evolutionary algorithm. Eng Appl Artif Intell 119:105803

Author information

Contributions

Both authors conceived the proposed idea and developed the theoretical formalism of re-induction based utility mining. The design and implementation of the algorithm, the analysis of the results, and the writing of the manuscript were done by Pushp S. Mathur. Prof. Satish Chand directed and guided the research study.

Corresponding author

Correspondence to Pushp S. Mathur.

Ethics declarations

Competing Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical and informed consent for data used

The data used in the experiments of this study are openly available on the internet; hence, ethical and informed consent for the data used is not applicable to this study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mathur, P.S., Chand, S. Re-induction based mining for high utility item-sets. Appl Intell 55, 75 (2025). https://doi.org/10.1007/s10489-024-05855-7

  • DOI: https://doi.org/10.1007/s10489-024-05855-7
