Abstract
Association rule mining is an important topic in data mining. The problem is to discover all (or almost all) associations among items in a transaction database that satisfy user-specified constraints, usually minimum support and minimum confidence. Class association rules (CARs) are a special type of association rule that can be applied to classification problems. Previous research has shown that classification based on association rules achieves higher accuracy than an inductive learning algorithm or C4.5. Many methods have therefore been proposed for mining CARs, but they rely on batch processing. In practice, datasets change often as records are added and/or deleted, which makes updating CARs a challenging problem. This paper proposes an efficient method for updating CARs when records are deleted. First, we use an MECR-tree to store nodes for the original dataset. The information in the nodes of this tree is updated based on the deleted records. Second, the concept of pre-large itemsets is used to avoid rescanning the original dataset. Finally, we propose an algorithm to efficiently update and generate CARs. We also analyze the time complexity to show the efficiency of the proposed algorithm. The experimental results show that the proposed method outperforms mining CARs from scratch on the dataset after record deletion.
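To make the support and confidence constraints and the idea of updating after deletion concrete, the following is a minimal sketch in Python. It is not the MECR-tree or the pre-large itemset technique from the paper: it enumerates every itemset by brute force, which is only feasible for toy data. The function names (`count_rules`, `delete_records`, `rules`) and the record layout are assumptions made for illustration. The point it shows is that counts from the original dataset can be adjusted using only the deleted records, without rescanning the records that remain.

```python
from collections import Counter
from itertools import combinations


def count_rules(records):
    """Count every itemset and every (itemset, class) pair in the records.

    Each record is a pair (items, label), where items is a set of
    "attribute=value" strings. Brute-force enumeration, toy data only.
    """
    itemset_counts = Counter()
    rule_counts = Counter()
    for items, label in records:
        for size in range(1, len(items) + 1):
            for combo in combinations(sorted(items), size):
                itemset_counts[combo] += 1
                rule_counts[(combo, label)] += 1
    return itemset_counts, rule_counts


def delete_records(itemset_counts, rule_counts, deleted):
    """Subtract the contribution of the deleted records in place.

    Only the deleted records are scanned, not the remaining dataset.
    """
    removed_itemsets, removed_rules = count_rules(deleted)
    itemset_counts.subtract(removed_itemsets)
    rule_counts.subtract(removed_rules)


def rules(itemset_counts, rule_counts, n_records, min_sup, min_conf):
    """Return rules itemset -> class that meet both thresholds."""
    found = {}
    for (itemset, label), count in rule_counts.items():
        if count <= 0:
            continue  # rule no longer occurs after deletion
        support = count / n_records
        confidence = count / itemset_counts[itemset]
        if support >= min_sup and confidence >= min_conf:
            found[(itemset, label)] = (support, confidence)
    return found


if __name__ == "__main__":
    records = [
        ({"a=1", "b=1"}, "yes"),
        ({"a=1", "b=2"}, "yes"),
        ({"a=1", "b=1"}, "no"),
        ({"a=2", "b=1"}, "no"),
    ]
    itemset_counts, rule_counts = count_rules(records)
    # Delete the third record and update the counts incrementally.
    delete_records(itemset_counts, rule_counts, [records[2]])
    print(rules(itemset_counts, rule_counts, 3, min_sup=0.5, min_conf=0.9))
```

In this sketch the counts after `delete_records` match a full recount over the remaining records, which is the property an incremental method has to preserve. The approach described in the abstract avoids the exponential enumeration by storing counts in tree nodes and uses pre-large itemsets to limit when the original data must be revisited.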
Acknowledgments
This work was carried out during the tenure of an ERCIM ‘Alain Bensoussan’ Fellowship Programme.
Cite this article
Nguyen, L.T.T., Nguyen, NT., Vo, B. et al. Efficient method for updating class association rules in dynamic datasets with record deletion. Appl Intell 48, 1491–1505 (2018). https://doi.org/10.1007/s10489-017-1023-z
DOI: https://doi.org/10.1007/s10489-017-1023-z