Maintaining the discovered high-utility itemsets with transaction modification

Applied Intelligence

Abstract

Most approaches to frequent-itemset discovery derive association rules from a binary database, so traditional association-rule mining does not consider profit, cost, or quantity. Utility mining was proposed to measure the utilities of purchased products and to derive high-utility itemsets (HUIs). Many algorithms have been proposed to find HUIs efficiently in a static database. In real-world applications, however, databases are dynamic: transactions are inserted, deleted, or modified. Existing batch approaches must re-process the whole updated database because they do not maintain previously discovered HUIs. In this paper, a Fast UPdated (FUP) strategy with a utility measure and a maintenance algorithm, called FUP-HUI-MOD, are developed to efficiently maintain and update the discovered HUIs. When transactions are modified, the proposed algorithm partitions the transactions before and after the modification into two parts, creating four cases. Each case is handled by a specific procedure that updates the discovered HUIs. With FUP-HUI-MOD, the original database does not have to be rescanned each time, in contrast to state-of-the-art batch-mode high-utility itemset mining algorithms. Experiments show that the proposed algorithm outperforms batch algorithms in maintaining HUIs.
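The maintenance idea in the abstract can be illustrated with a minimal Python sketch. It is not the FUP-HUI-MOD algorithm itself, since the abstract does not define the four cases or their procedures. The sketch assumes that an itemset's utility in a transaction is the sum of quantity times unit profit over its items, and that the four cases arise from crossing "high or low utility in the original database" with "utility rose or fell in the modified transactions". The function names, the case numbering, the sample data, and the threshold are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Transaction = Dict[str, int]  # item -> purchased quantity


def itemset_utility(itemset: FrozenSet[str],
                    transactions: List[Transaction],
                    profit: Dict[str, int]) -> int:
    """Sum quantity * unit profit over every transaction containing the whole itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


def update_utility(old_utility: int,
                   itemset: FrozenSet[str],
                   before: List[Transaction],
                   after: List[Transaction],
                   profit: Dict[str, int]) -> Tuple[int, int]:
    """Update a stored utility by scanning only the modified transactions."""
    delta = (itemset_utility(itemset, after, profit)
             - itemset_utility(itemset, before, profit))
    return old_utility + delta, delta


def classify_case(was_high: bool, delta: int) -> int:
    """Hypothetical four-way split (assumed, not taken from the paper)."""
    if was_high:
        return 1 if delta >= 0 else 2
    return 3 if delta >= 0 else 4


# Hypothetical data: unit profits and a three-transaction database.
profit = {"A": 3, "B": 10, "C": 1}
db = [{"A": 2, "B": 1}, {"A": 1, "C": 4}, {"B": 2, "C": 1}]
min_utility = 20  # assumed threshold

ab = frozenset({"A", "B"})
old_u = itemset_utility(ab, db, profit)           # 16, below the threshold

# The second transaction is modified: one unit of B is added.
before = [db[1]]
after = [{"A": 1, "B": 1, "C": 4}]
new_u, delta = update_utility(old_u, ab, before, after, profit)

case = classify_case(old_u >= min_utility, delta)  # low before, utility rose
is_high_now = new_u >= min_utility
```

Here the updated utility of {A, B} is obtained from the stored value and the modified transaction alone, and it matches a full recomputation over the updated database. A real maintenance algorithm would also need an upper bound on utility to prune candidates, because plain utility is not downward-closed; that part is omitted here.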

Acknowledgments

This research was partially supported by the Shenzhen Peacock Project, China, under grant KQC201109020055A; by the Natural Scientific Research Innovation Foundation in Harbin Institute of Technology under grant HIT.NSRIf.2014100; by the Shenzhen Strategic Emerging Industries Program under grant ZDSY20120613125016389; and by the Tencent Project under grant CCF-TencentRAGR20140114.

Author information

Corresponding author

Correspondence to Jerry Chun-Wei Lin.

About this article

Cite this article

Lin, J.CW., Gan, W. & Hong, TP. Maintaining the discovered high-utility itemsets with transaction modification. Appl Intell 44, 166–178 (2016). https://doi.org/10.1007/s10489-015-0697-3

  • DOI: https://doi.org/10.1007/s10489-015-0697-3
