Efficient algorithm for mining high average-utility itemsets in incremental transaction databases

Kim, Donggyu; Yun, Unil

doi:10.1007/s10489-016-0890-z

Published: 14 February 2017

Applied Intelligence, volume 47, pages 114–131 (2017)

Donggyu Kim¹ &
Unil Yun¹

Abstract

In this paper, we present a novel algorithm for efficiently mining high average-utility itemsets (HAUIs) from incremental databases, whose volumes can grow dynamically. Previous algorithms are inefficient because they follow the basic framework of an Apriori-like approach: they must scan a given database multiple times to generate candidate itemsets and determine valid itemsets level by level. This drawback can cause critical problems in processing incremental databases, because scanning a database becomes more costly as the database grows. In contrast, the algorithm proposed in this paper builds a compact tree structure that maintains all necessary information, which avoids such excessive database scanning during the mining process. Previous algorithms also generate a huge number of unnecessary candidate itemsets at each level, because the naive combination-based candidate generation of an Apriori-like approach produces candidate itemsets of length k+1 by simply joining itemsets of length k. Our algorithm instead employs the pattern growth approach, which allows it to generate only essential candidate itemsets. To keep its tree structure compact throughout the incremental mining process, our algorithm applies a restructuring technique. In the performance evaluation, we show that our algorithm is faster and consumes less memory than its competitors.
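To make the mining task concrete, the following is a minimal Python sketch of what a high average-utility itemset is and how the result set can change when a database grows. It is not the tree-based algorithm described in the abstract: it enumerates every itemset by brute force, purely for illustration. The abstract does not define average utility, so the sketch assumes the definition commonly used in the HAUI literature (the summed utility of an itemset's items over the transactions containing it, divided by the itemset's length) and a threshold expressed as a fraction of the total database utility. The toy transactions and the names `average_utility`, `mine_hauis`, and `min_ratio` are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps an item to its utility
# in that transaction (for example, quantity times unit profit).
original = [{"a": 5, "b": 2}, {"a": 3, "c": 6}, {"b": 4, "c": 1}]
increment = [{"a": 4, "b": 6}]  # transactions added later


def average_utility(itemset, transactions):
    """Assumed definition: utility summed over the transactions that
    contain every item of the itemset, divided by the itemset length."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] for item in itemset)
    return total / len(itemset)


def mine_hauis(transactions, min_ratio):
    """Brute-force enumeration (illustrative only, exponential cost).

    An itemset is reported when its average utility reaches
    min_ratio * (total utility of the database), an assumed threshold form.
    """
    threshold = min_ratio * sum(sum(t.values()) for t in transactions)
    items = sorted({item for t in transactions for item in t})
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            au = average_utility(itemset, transactions)
            if au >= threshold:
                result[itemset] = au
    return result


before = mine_hauis(original, 0.3)
after = mine_hauis(original + increment, 0.3)
print(before)  # {('a',): 8.0, ('c',): 7.0}
print(after)   # {('a',): 12.0, ('b',): 12.0}
```

In this toy run the added transaction raises the threshold and the utility of `b`, so `c` drops out of the result and `b` enters it. Recomputing from scratch, as done here, rescans the whole database after every increment, which is the cost the abstract says the proposed tree structure is designed to avoid.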

Acknowledgments

This research was supported by the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (NRF No. 20152062051 and NRF No. 20155054624), and by the Business for Academic-industrial Cooperative establishments funded by the Korea Small and Medium Business Administration in 2015 (Grant No. C0261068).

Author information

Authors and Affiliations

Department of Computer Engineering, Sejong University, Seoul, Republic of Korea
Donggyu Kim & Unil Yun

Corresponding author

Correspondence to Unil Yun.

About this article

Cite this article

Kim, D., Yun, U. Efficient algorithm for mining high average-utility itemsets in incremental transaction databases. Appl Intell 47, 114–131 (2017). https://doi.org/10.1007/s10489-016-0890-z

Published: 14 February 2017
Issue Date: July 2017
DOI: https://doi.org/10.1007/s10489-016-0890-z
