Abstract
Most approaches for discovering frequent itemsets derive association rules from a binary database. Traditional association-rule mining therefore does not consider factors such as profit, cost, and quantity. Utility mining was proposed to measure the utilities of purchased products and thereby derive high-utility itemsets (HUIs). Many algorithms have been proposed to efficiently find HUIs in a static database. In real-world applications, however, databases are dynamic: transactions are inserted, deleted, or modified. Existing batch approaches have to re-process the entire updated database because previously discovered HUIs are not maintained. In this paper, a Fast UPdated (FUP) strategy with a utility measure and a maintenance algorithm, called FUP-HUI-MOD, are developed to efficiently maintain and update the discovered HUIs. When transactions are modified, the proposed algorithm partitions the transactions before and after the modification into two parts, creating four cases. Each case is handled by a specific procedure to update the discovered HUIs. With the designed FUP-HUI-MOD algorithm, the original database does not need to be rescanned each time, unlike state-of-the-art high-utility itemset mining algorithms that run in batch mode. Experiments show that the proposed algorithm outperforms batch algorithms in maintaining HUIs.
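The four-case idea can be illustrated with a small sketch. This is not the FUP-HUI-MOD algorithm itself: the abstract does not spell out the cases, so the code assumes they arise from crossing "the itemset was high-utility in the original database or not" with "the modification raised or lowered its utility". The item profits, the toy transactions, and all function names are hypothetical, and plain itemset utility is used here for brevity instead of any upper-bound measure the paper may rely on.

```python
# Hypothetical unit profits; item names and values are illustrative only.
PROFIT = {"a": 3, "b": 1, "c": 5}


def utility(itemset, transaction):
    """Utility of an itemset in one transaction (0 if any item is absent)."""
    if any(item not in transaction for item in itemset):
        return 0
    return sum(transaction[item] * PROFIT[item] for item in itemset)


def db_utility(itemset, transactions):
    """Utility of an itemset summed over a list of transactions."""
    return sum(utility(itemset, t) for t in transactions)


def total_utility(transactions):
    """Total utility of every item in every transaction."""
    return sum(qty * PROFIT[item] for t in transactions for item, qty in t.items())


def classify_case(itemset, original_db, before, after, ratio):
    """Return (case, delta) under the assumed four-case reading.

    `before` and `after` hold only the modified transactions, in their old
    and new versions, so `delta` is computed without rescanning the rest.
    """
    was_high = db_utility(itemset, original_db) >= ratio * total_utility(original_db)
    delta = db_utility(itemset, after) - db_utility(itemset, before)
    if was_high:
        return (1 if delta >= 0 else 2), delta
    return (3 if delta >= 0 else 4), delta


# Toy database: each transaction maps an item to its purchased quantity.
original_db = [{"a": 1, "b": 2}, {"b": 4, "c": 1}, {"a": 2, "c": 3}]
before = [original_db[1]]          # the transaction as it was
after = [{"a": 1, "c": 2}]         # the same transaction after modification
updated_db = [original_db[0], after[0], original_db[2]]

case_ac, delta_ac = classify_case(("a", "c"), original_db, before, after, 0.5)
case_b, delta_b = classify_case(("b",), original_db, before, after, 0.5)

# Incremental update: stored utility plus the change from modified transactions.
updated_ac = db_utility(("a", "c"), original_db) + delta_ac
```

In this sketch the utility of an itemset that was already known is updated from the modified transactions alone. Only an itemset that was not high-utility before but gained utility (case 3 under this assumed reading) would plausibly need another look at the original database.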
Acknowledgments
This research was partially supported by the Shenzhen Peacock Project, China, under grant KQC201109020055A; by the Natural Scientific Research Innovation Foundation in Harbin Institute of Technology under grant HIT.NSRIf.2014100; by the Shenzhen Strategic Emerging Industries Program under grant ZDSY20120613125016389; and by the Tencent Project under grant CCF-TencentRAGR20140114.
Cite this article
Lin, J.CW., Gan, W. & Hong, TP. Maintaining the discovered high-utility itemsets with transaction modification. Appl Intell 44, 166–178 (2016). https://doi.org/10.1007/s10489-015-0697-3
DOI: https://doi.org/10.1007/s10489-015-0697-3