Efficient evolutionary computation model of closed high-utility itemset mining

Applied Intelligence

Abstract

High-utility itemset mining (HUIM) has become an important topic in recent years, particularly in market-basket analysis, because it identifies useful information or goods for decision-making. Numerous studies have focused on extracting high-utility itemsets from datasets, which reveals a tremendous amount of pattern information. This approach cannot support correct decisions within a short time, e.g., in real-time and online decision-making systems, because it is difficult to extract the relevant and important information quickly from such a large body of discovered knowledge. Closed high-utility itemset mining is a market engineering method that discovers fewer but still lucrative patterns. However, prior research has been unable to handle very large data, which makes it incompatible with today’s Internet of Things (IoT) environments, where huge volumes of data are collected every second. We introduce a multi-objective model for mining closed high-utility itemsets (called MCUI-Miner), which employs the MapReduce framework within Spark. First, a multi-objective k-means model is used to group transactions based on their significant relationship to the frequency component. The MapReduce model and a genetic algorithm (GA) are then used to examine potential candidates for mining closed high-utility itemsets in a large-scale database. Experiments show that the proposed framework outperforms the conventional CLS-Miner in terms of runtime, memory use, and scalability.
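To make the notion of a closed high-utility itemset concrete, the following is a minimal brute-force sketch in Python. It is not MCUI-Miner: it uses no clustering, no GA, and no MapReduce. The toy database, the function names, and the threshold `min_util` are assumptions made for illustration only. It uses the standard definitions: an itemset is closed if no proper superset has the same support, and it is high-utility if its total utility over the transactions that contain it reaches the threshold.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> utility
# (e.g. quantity * unit profit). The values are illustrative only.
TRANSACTIONS = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"a": 5, "b": 4, "c": 2},
    {"b": 8, "d": 3},
]


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset.issubset(t))


def utility(itemset, transactions):
    """Sum of the itemset's item utilities over the transactions containing it."""
    return sum(
        sum(t[item] for item in itemset)
        for t in transactions
        if itemset.issubset(t)
    )


def closed_high_utility_itemsets(transactions, min_util):
    """Enumerate all itemsets (exponential; toy data only) and keep those
    that are closed and whose utility is at least min_util."""
    items = sorted(set().union(*transactions))
    candidates = [
        frozenset(c)
        for size in range(1, len(items) + 1)
        for c in combinations(items, size)
    ]
    sup = {x: support(x, transactions) for x in candidates}
    result = {}
    for x in candidates:
        if sup[x] == 0:
            continue
        # Support never increases when items are added, so checking the
        # one-item extensions is enough to decide closedness.
        is_closed = all(sup[x | {i}] < sup[x] for i in items if i not in x)
        if is_closed:
            u = utility(x, transactions)
            if u >= min_util:
                result[x] = u
    return result


if __name__ == "__main__":
    for itemset, u in closed_high_utility_itemsets(TRANSACTIONS, 15).items():
        print(sorted(itemset), u)
```

On this toy database the closed itemsets are {b}, {a, c}, {b, d}, and {a, b, c}; with `min_util = 15` only {a, c} and {a, b, c} remain. Exhaustive enumeration grows exponentially with the number of items, which is the cost that approaches such as the one described in the abstract aim to avoid on large-scale data.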

Notes

  1. https://github.com/zakimjz/IBMGenerator

Author information

Corresponding author

Correspondence to Jerry Chun-Wei Lin.

About this article

Cite this article

Lin, J.CW., Djenouri, Y., Srivastava, G. et al. Efficient evolutionary computation model of closed high-utility itemset mining. Appl Intell 52, 10604–10616 (2022). https://doi.org/10.1007/s10489-021-03134-3

  • DOI: https://doi.org/10.1007/s10489-021-03134-3
