Efficient high utility itemset mining using buffered utility-lists

Duong, Quang-Huy; Fournier-Viger, Philippe; Ramampiaro, Heri; Nørvåg, Kjetil; Dam, Thu-Lan

doi:10.1007/s10489-017-1057-2

Efficient high utility itemset mining using buffered utility-lists

Published: 15 September 2017

Volume 48, pages 1859–1877, (2018)
Cite this article

Applied Intelligence Aims and scope Submit manuscript

Quang-Huy Duong¹,
Philippe Fournier-Viger²,
Heri Ramampiaro¹,
Kjetil Nørvåg¹ &
…
Thu-Lan Dam¹

814 Accesses
73 Citations
Explore all metrics

Abstract

Discovering high utility itemsets in transaction databases is a key task for studying the behavior of customers. It consists of finding groups of items bought together that yield a high profit. Several algorithms have been proposed to mine high utility itemsets using various approaches and more or less complex data structures. Among existing algorithms, one-phase algorithms employing the utility-list structure have shown to be the most efficient. In recent years, the simplicity of the utility-list structure has led to the development of numerous utility-list based algorithms for various tasks related to utility mining. However, a major limitation of utility-list based algorithms is that creating and maintaining utility-lists are time consuming and can consume a huge amount of memory. The reasons are that numerous utility lists are built and that the utility-list intersection/join operation to construct a utility-list is costly. This paper addresses this issue by proposing an improved utility-list structure called utility-list buffer to reduce the memory consumption and speed up the join operation. This structure is integrated into a novel algorithm named ULB-Miner (Utility-List Buffer for high utility itemset Miner), which introduces several new ideas to more efficiently discover high utility itemsets. ULB-Miner uses the designed utility-list buffer structure to efficiently store and retrieve utility-lists, and reuse memory during the mining process. Moreover, the paper also introduces a linear time method for constructing utility-list segments in a utility-list buffer. An extensive experimental study on various datasets shows that the proposed algorithm relying on the novel utility-list buffer structure is highly efficient in terms of both execution time and memory consumption. The ULB-Miner algorithm is up to 10 times faster than the FHM and HUI-Miner algorithms and consumes up to 6 times less memory. Moreover, it performs well on both dense and sparse datasets.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

More Efficient Algorithms for Mining High-Utility Itemsets with Multiple Minimum Utility Thresholds

A fast algorithm for mining high average-utility itemsets

Article 11 March 2017

Efficient Algorithms for High Utility Itemset Mining Without Candidate Generation

Notes

References

Agrawal R, Srikan R (1994) Fast algorithms for mining association rules in large databases. In: Proceedings of 20th international conference on very large data bases (VLDB 1994). Morgan Kaufmann, pp 487–499
Agrawal R, Srikant R (1994) Quest synthetic data generator. Available at. http://www.almaden.ibm.com/cs/quest/syndata.html
Ahmed C, Tanbeer S, Jeong BS, Lee YK (2009) Efficient tree structures for high utility pattern mining in incremental databases. IEEE Trans Knowl Data Eng 21(12):1708–1721
Article Google Scholar
Ahmed CF, Tanbeer SK, Jeong BS, Lee YK (2009) Efficient mining of utility-based web path traversal patterns. In: Proceedings of the 11th international conference on advanced communication technology - vol 3, ICACT’09, pp. 2215–2218
Chan R, Yang Q, Shen YD (2003) Mining high utility itemsets. In: Proceedings of the 3rd IEEE international conference on data mining, pp 19–26
Dam TL, Li K, Fournier-Viger P, Duong QH (2016) An efficient algorithm for mining top-rank-k frequent patterns. Appl Intell 45(1):96–111
Article Google Scholar
Dam TL, Li K, Fournier-Viger P, Duong QH (2017) CLS-Miner: efficient and effective closed high utility itemset mining. Frontiers of Computer Science, pp 1–27
Dam TL, Li K, Fournier-Viger P, Duong QH (2017) An efficient algorithm for mining top-k on-shelf high utility itemsets. Knowl Inf Syst 52(3):621–655
Duong QH, Liao B, Fournier-Viger P, Dam TL (2016) An efficient algorithm for mining the top-k high utility itemsets, using novel threshold raising and pruning strategies. Knowl-Based Syst 104:106–122
Article Google Scholar
Fournier-Viger P, Gomariz A, Gueniche T, Soltani A, Wu CW, Tseng V (2014) SPMF: A java open-source pattern mining library. J Mach Learn Res 15:3569–3573
MATH Google Scholar
Fournier-Viger P, Lin JC, Duong Q, Dam T (2016) FHM+: Faster high-utility itemset mining using length upper-bound reduction. In: Proceedings of the 29th international conference on industrial engineering and other applications of applied intelligent systems, pp 115–127
Fournier-Viger P, Lin JCW, Duong QH, Dam TL (2016) PHM: Mining periodic high-utility itemsets. In: Proceedings of the 16th industrial conference on data mining. Springer, pp 64–79. Springer
Fournier-Viger P, Wu CW, Zida S, Tseng V (2014) FHM: Faster high-utility itemset mining using estimated utility co-occurrence pruning. In: Proceedings of the 21st international symposium on methodologies for intelligent systems, pp 83–92
Fournier-Viger P, Zida S (2015) FOSHU: Faster on-shelf high utility itemset mining – with or without negative unit profit. In: Proceedings of the 30th annual ACM symposium on applied computing, SAC ’15, pp 857–864
Grahne G, Zhu J (2005) Fast algorithms for frequent itemset mining using fp-trees. IEEE Trans Knowl Data Eng 17(10):1347–1362
Article Google Scholar
Han J, Wang J, Lu Y, Tzvetkov P (2002) Mining top-k frequent closed patterns without minimum support. In: Proceedings of the IEEE international conference on data mining, pp 211–218
Han JW, Pei J, Yin YW (2004) Mining frequent patterns without candidate generation: A frequent-pattern tree approach. Data Min Knowl Disc 8(1):53–87
Article MathSciNet Google Scholar
Joshi M, Bhalodia D (2016) Mining high utility itemset using graphics processor. In: Proceedings of the international symposium on intelligent systems technologies and applications, pp 665–674
Krishnamoorthy S (2015) Pruning strategies for mining high utility itemsets. Expert Syst Appl 42(5):2371–2381
Article Google Scholar
Lan GC, Hong TP, Tseng V (2014) An efficient projection-based indexing approach for mining high utility itemsets. Knowl Inf Syst 38(1):85–107
Article Google Scholar
Lee S, Park JS (2016) Top-k high utility itemset mining based on utility-list structures. In: Proceedings of the international conference on big data and smart computing, pp 101–108
Lin JCW, Gan W, Fournier-Viger P, Hong TP, Tseng V (2016) Efficient algorithms for mining high-utility itemsets in uncertain databases. Knowl-Based Syst 96:171–187
Article Google Scholar
Liu M, Qu J (2012) Mining high utility itemsets without candidate generation. In: Proceedings of the 21st ACM international conference on information and knowledge management, CIKM ’12, pp 55–64
Liu Y, Liao WK, Choudhary A (2005) A two-phase algorithm for fast discovery of high utility itemsets. In: Proceedings of the 9th pacific-asia conference on advances in knowledge discovery and data mining, PAKDD’05, pp 689–695
Sahoo J, Das AK, Goswami A (2016) An efficient fast algorithm for discovering closed+ high utility itemsets. Appl Intell 45(1):44–74
Song W, Liu Y, Li J (2014) BAHUI: Fast and memory efficient mining of high utility itemsets based on bitmap. Int J Data Warehouse Min 10(1):1–15
Article Google Scholar
Song W, Liu Y, Li J (2014) Mining high utility itemsets by dynamically pruning the tree structure. Appl Intell 40(1):29– 43
Article Google Scholar
Song W, Zhang Z, Li J (2016) A high utility itemset mining algorithm based on subsume index. Knowl Inf Syst 49(1):315– 340
Article Google Scholar
Thilagu M, Nadarajan R (2012) Efficiently mining of effective web traversal patterns with average utility. Procedia Technol 6:444–451
Article Google Scholar
Tseng V, Shie BE, Wu CW, Yu P (2013) Efficient algorithms for mining high utility itemsets from transactional databases. IEEE Trans Knowl Data Eng 25(8):1772–1786
Article Google Scholar
Tseng V, Wu CW, Fournier-Viger P, Yu P (2016) Efficient algorithms for mining top-k high utility itemsets. IEEE Trans Knowl Data Eng 28(1):54–67
Article Google Scholar
Wang JZ, Huang JL, Chen YC (2016) On efficiently mining high utility sequential patterns. Knowl Inf Syst 49(2):597–627
Article Google Scholar
Wu CW, Fournier-Viger P, Gu JY, Tseng V (2015) Mining closed+ high utility itemsets without candidate generation. In: 2015 conference on technologies and applications of artificial intelligence (TAAI), pp 187–194
Wu CW, Shie BE, Tseng V, Yu PS (2012) Mining top-k high utility itemsets. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’12, pp 78–86
Liu Y-C, Cheng C-P, Tseng V (2013) Mining differential top-k co-expression patterns from time course comparative gene expression datasets. BMC Bioinformatics 14:230
Article Google Scholar
Yun U, Ryang H, Lee G, Fujita H (2017) An efficient algorithm for mining high utility patterns from incremental databases with one database scan. Knowl-Based Syst 124:188–206
Yun U, Ryang H, Ryu KH (2014) High utility itemset mining with techniques for reducing overestimated utilities and pruning candidates. Expert Syst Appl 41(8):3861–3878
Article Google Scholar
Zaki MJ, Gouda K (2003) Fast vertical mining using diffsets. In: Proceedings of the 9th ACM SIGKDD international conference on knowledge discovery and data mining, pp 326–335

Download references

Acknowledgments

This research was partly funded by the Norwegian University of Science and Technology (NTNU) through the MUSED project and partly supported by the Youth 1000 funding of Prof. Philippe Fournier-Viger. The work of Mrs. Dam was carried out during the tenure of an ERCIM “Alain Bensoussan” Fellowship Programme.

Author information

Authors and Affiliations

Department of Computer and Information Science, Norwegian University of Science and Technology, Trondheim, Norway
Quang-Huy Duong, Heri Ramampiaro, Kjetil Nørvåg & Thu-Lan Dam
School of Humanities and Social Sciences, Harbin Institute of Technology (Shenzhen), Shenzhen, GD, 518055, China
Philippe Fournier-Viger

Authors

Quang-Huy Duong
View author publications
You can also search for this author in PubMed Google Scholar
Philippe Fournier-Viger
View author publications
You can also search for this author in PubMed Google Scholar
Heri Ramampiaro
View author publications
You can also search for this author in PubMed Google Scholar
Kjetil Nørvåg
View author publications
You can also search for this author in PubMed Google Scholar
Thu-Lan Dam
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to Philippe Fournier-Viger or Heri Ramampiaro.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Duong, QH., Fournier-Viger, P., Ramampiaro, H. et al. Efficient high utility itemset mining using buffered utility-lists. Appl Intell 48, 1859–1877 (2018). https://doi.org/10.1007/s10489-017-1057-2

Download citation

Published: 15 September 2017
Issue Date: July 2018
DOI: https://doi.org/10.1007/s10489-017-1057-2

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Efficient high utility itemset mining using buffered utility-lists

Abstract

Access this article

Similar content being viewed by others

More Efficient Algorithms for Mining High-Utility Itemsets with Multiple Minimum Utility Thresholds

A fast algorithm for mining high average-utility itemsets

Efficient Algorithms for High Utility Itemset Mining Without Candidate Generation

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding authors

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Efficient high utility itemset mining using buffered utility-lists

Abstract

Access this article

Similar content being viewed by others

More Efficient Algorithms for Mining High-Utility Itemsets with Multiple Minimum Utility Thresholds

A fast algorithm for mining high average-utility itemsets

Efficient Algorithms for High Utility Itemset Mining Without Candidate Generation

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding authors

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation