An efficient method for mining multi-level high utility Itemsets

Tung, N. T.; Nguyen, Loan T. T.; Nguyen, Trinh D. D.; Vo, Bay

doi:10.1007/s10489-021-02681-z

Published: 13 August 2021

Applied Intelligence, volume 52, pages 5475–5496 (2022)

N. T. Tung⁴, Loan T. T. Nguyen²,³, Trinh D. D. Nguyen⁴ & Bay Vo¹ (ORCID: orcid.org/0000-0002-2723-1138)

Abstract

High-utility itemset mining (HUIM) is a useful data mining tool for analyzing customer behavior. Whereas frequent itemset mining (FIM) algorithms detect patterns that occur often, HUIM algorithms discover the most beneficial itemsets in transaction databases, namely the high-utility itemsets (HUIs). Several algorithms have been proposed to carry out this task effectively, but most of them ignore the categorization of items. In many real-world transaction databases, items are organized into categories and subcategories, and this information is represented as a taxonomy. Because traditional HUIM algorithms do not use it, they can only discover itemsets at the lowest level of abstraction and miss several important patterns at higher levels. To address this limitation, this work incorporates the item taxonomy into the mining process. In addition, to further improve performance, several effective pruning techniques are revised and used to reduce the search space when the taxonomy of items is considered. To accurately find multi-level HUIs in transaction databases enriched with taxonomy information, a new algorithm called MLHMiner (Multiple-Level HMiner) is proposed, which extends the HMiner algorithm. We also prove that the pruning techniques of HMiner can be applied at different abstraction levels to mine multi-level HUIs efficiently. Experimental evaluations on several real and synthetic databases show that the proposed approach identifies useful patterns at different abstraction levels with high efficiency.
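
To make the idea of a multi-level HUI concrete, the following Python sketch mines a toy database at two abstraction levels by brute force. It is not MLHMiner and implements none of the HMiner pruning techniques. The data, the function names, and the definition used (the utility of a category in a transaction is the sum of the utilities of its child items in that transaction, and itemsets contain items from a single level) are illustrative assumptions based on the common formulation of the problem, not details taken from the article.

```python
from itertools import combinations

# Hypothetical toy data (assumed for illustration, not from the article).
# Taxonomy as child -> parent for a two-level hierarchy.
PARENT = {"cola": "drinks", "juice": "drinks", "bread": "bakery", "bagel": "bakery"}

# Each transaction maps a leaf item to its utility (e.g. quantity * unit profit).
TRANSACTIONS = [
    {"cola": 4, "bread": 3},
    {"juice": 5, "bagel": 2},
    {"cola": 2, "juice": 1, "bread": 6},
]


def generalize(transaction, parent):
    """Project a leaf-level transaction one level up the taxonomy.

    Assumption: the utility of a category is the sum of the utilities
    of its children that occur in the transaction.
    """
    projected = {}
    for item, utility in transaction.items():
        category = parent[item]
        projected[category] = projected.get(category, 0) + utility
    return projected


def high_utility_itemsets(transactions, min_util):
    """Enumerate every itemset and keep those with utility >= min_util.

    Exhaustive search, exponential in the number of items; a real miner
    would prune the search space instead.
    """
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            # An itemset earns utility only in transactions containing all its items.
            total = sum(
                sum(t[i] for i in itemset)
                for t in transactions
                if all(i in t for i in itemset)
            )
            if total >= min_util:
                result[itemset] = total
    return result


def mine_levels(transactions, parent, min_util):
    """Mine the leaf level and the category level separately."""
    upper = [generalize(t, parent) for t in transactions]
    return {
        "leaf": high_utility_itemsets(transactions, min_util),
        "category": high_utility_itemsets(upper, min_util),
    }


if __name__ == "__main__":
    print(mine_levels(TRANSACTIONS, PARENT, min_util=16))
```

With a minimum utility of 16, no leaf-level itemset qualifies in this toy database (the best, {bread, cola}, reaches 15), while the category-level itemset {bakery, drinks} has utility 23. This is the kind of higher-level pattern that a miner restricted to the lowest abstraction level would leave out.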

Acknowledgements

This research is funded by Vietnam National University Ho Chi Minh City (VNU-HCM) under grant number C2020-28-04.

Author information

Authors and Affiliations

1. Faculty of Information Technology, Ho Chi Minh City University of Technology (HUTECH), Ho Chi Minh City, Vietnam: Bay Vo
2. School of Computer Science and Engineering, International University, Ho Chi Minh City, Vietnam: Loan T. T. Nguyen
3. Vietnam National University, Ho Chi Minh City, Vietnam: Loan T. T. Nguyen
4. University of Information Technology, Vietnam National University of Ho Chi Minh City, Ho Chi Minh City, Vietnam: N. T. Tung & Trinh D. D. Nguyen

Corresponding authors

Correspondence to Loan T. T. Nguyen or Bay Vo.

About this article

Cite this article

Tung, N.T., Nguyen, L.T.T., Nguyen, T.D.D. et al. An efficient method for mining multi-level high utility Itemsets. Appl Intell 52, 5475–5496 (2022). https://doi.org/10.1007/s10489-021-02681-z

Accepted: 11 July 2021
Published: 13 August 2021
Issue Date: March 2022
DOI: https://doi.org/10.1007/s10489-021-02681-z