
An efficient method for mining multi-level high utility Itemsets

Applied Intelligence

Abstract

High-utility itemset mining (HUIM) is a data mining technique for analyzing customer behavior. Whereas frequent itemset mining (FIM) algorithms detect patterns that occur often, HUIM algorithms discover the most beneficial itemsets in a transaction database, namely the high-utility itemsets (HUIs). Several algorithms have been proposed to carry out this task efficiently, but most of them ignore the categorization of items. Many real-world transaction databases come with information about the categories and subcategories of items, represented as a taxonomy. Because traditional HUIM algorithms do not use this information, they can only discover itemsets at the lowest level of abstraction and miss important patterns at higher levels. To address this limitation, this work incorporates the item taxonomy into the mining process. In addition, several effective pruning techniques are revised and applied to reduce the search space when the taxonomy is taken into account. To find multi-level HUIs in transaction databases enriched with taxonomy information, a new algorithm called MLHMiner (Multiple-Level HMiner) is proposed, which extends the HMiner algorithm. We also prove that the pruning techniques of HMiner can be applied at different abstraction levels to mine multi-level HUIs efficiently. Experimental evaluations on several real and synthetic databases show that the proposed approach identifies useful patterns at different abstraction levels with high efficiency.
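To make the idea of multi-level utility concrete, the following is a minimal brute-force sketch in Python. It is not MLHMiner and uses none of the HMiner pruning techniques; it only illustrates, under stated assumptions, how an itemset's utility can be computed and why patterns can appear at a higher abstraction level even when no leaf-level itemset qualifies. The transactions, unit profits, two-level taxonomy, and the names `item_utility`, `contains`, `utility`, and `mine_level` are all hypothetical. The utility of a generalized item is assumed to be the sum of the utilities of its descendant items in the transaction.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps a leaf item to its quantity.
transactions = [
    {"cola": 2, "chips": 1},
    {"juice": 1, "chips": 3},
    {"cola": 1, "juice": 2, "bread": 1},
]
# Hypothetical unit profit of each leaf item.
profit = {"cola": 3, "juice": 4, "chips": 1, "bread": 2}
# Hypothetical two-level taxonomy: leaf item -> parent category.
parent = {"cola": "drinks", "juice": "drinks", "chips": "snacks", "bread": "bakery"}


def item_utility(item, tx):
    """Utility of a leaf item or a category in a single transaction."""
    if item in profit:  # leaf item: quantity times unit profit
        return tx.get(item, 0) * profit[item]
    # Category (assumption): sum the utilities of its children in the transaction.
    return sum(qty * profit[leaf] for leaf, qty in tx.items() if parent[leaf] == item)


def contains(item, tx):
    """True if the transaction holds the leaf item or any child of the category."""
    if item in profit:
        return item in tx
    return any(parent[leaf] == item for leaf in tx)


def utility(itemset, db):
    """Total utility of an itemset over all transactions that contain every item."""
    return sum(
        sum(item_utility(item, tx) for item in itemset)
        for tx in db
        if all(contains(item, tx) for item in itemset)
    )


def mine_level(items, db, min_util):
    """Exhaustively list itemsets over one abstraction level with utility >= min_util."""
    found = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(sorted(items), size):
            u = utility(combo, db)
            if u >= min_util:
                found[combo] = u
    return found


leaf_huis = mine_level(set(profit), transactions, min_util=14)
category_huis = mine_level(set(parent.values()), transactions, min_util=14)
print(leaf_huis)
print(category_huis)
```

With this toy data and a threshold of 14, no leaf-level itemset qualifies, while the category-level itemsets `("drinks",)` and `("drinks", "snacks")` do. The exhaustive enumeration is exponential in the number of items, which is the cost that pruning techniques aim to avoid.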

Acknowledgements

This research is funded by Vietnam National University Ho Chi Minh City (VNU-HCM) under grant number C2020-28-04.

Author information

Corresponding authors

Correspondence to Loan T. T. Nguyen or Bay Vo.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tung, N.T., Nguyen, L.T.T., Nguyen, T.D.D. et al. An efficient method for mining multi-level high utility Itemsets. Appl Intell 52, 5475–5496 (2022). https://doi.org/10.1007/s10489-021-02681-z

  • DOI: https://doi.org/10.1007/s10489-021-02681-z
