Abstract
As an effective extension of rough set theory, the variable precision neighborhood rough set model has been used to improve decision tree algorithms for continuous data by means of attribute dependency. However, existing algorithms do not take into account the boundary region, which is an effective description of the uncertainty of knowledge. In this paper, we define a novel decision rule based on the boundary region and attribute dependency, and construct a decision tree algorithm from this rule. First, we define a measure based on the boundary region, called the boundary coefficient, which supports quantitative comparison. Second, we define the boundary mixed attribute dependency by combining the boundary coefficient with the attribute dependency, so that both the boundary case of the target set and the attribute dependency are considered. Finally, we propose a novel decision tree algorithm that uses the boundary mixed attribute dependency as the decision rule. The experimental results show that, with a slight increase in the number of leaf nodes, the total running time decreases and the maximum accuracy increases to 0.9518, which indicates the effectiveness of the proposed algorithm.
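The abstract does not state the formulas, so the following is only a minimal sketch of the standard quantities it builds on: the positive region, the boundary region, and the attribute dependency of a variable precision neighborhood rough set. The function names, the Euclidean neighborhood with radius `delta`, and the threshold convention for `beta` are assumptions for illustration. In particular, `boundary_ratio` is a hypothetical stand-in (the share of samples in the boundary region) and is not the paper's boundary coefficient or boundary mixed attribute dependency.

```python
import numpy as np


def neighborhood_matrix(X, delta):
    """N[i, j] is True when sample j lies within Euclidean distance delta of sample i."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)) <= delta


def vp_regions(X, y, delta, beta):
    """Return boolean masks (positive region, boundary region) for decision y.

    Assumed convention, with 0.5 < beta <= 1:
      lower approximation of a class: inclusion degree >= beta
      upper approximation of a class: inclusion degree > 1 - beta
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    N = neighborhood_matrix(X, delta)
    sizes = N.sum(axis=1)  # each neighborhood contains its own sample, so sizes >= 1
    in_lower = np.zeros(len(y), dtype=bool)
    in_upper = np.zeros(len(y), dtype=bool)
    for label in np.unique(y):
        # Fraction of each sample's neighborhood that belongs to this class.
        inclusion = (N & (y == label)[None, :]).sum(axis=1) / sizes
        in_lower |= inclusion >= beta
        in_upper |= inclusion > 1.0 - beta
    return in_lower, in_upper & ~in_lower


def dependency(X, y, delta, beta):
    """Attribute dependency: share of samples in the positive region."""
    positive, _ = vp_regions(X, y, delta, beta)
    return positive.mean()


def boundary_ratio(X, y, delta, beta):
    """Hypothetical illustration only: share of samples in the boundary region."""
    _, boundary = vp_regions(X, y, delta, beta)
    return boundary.mean()


# Toy data: one continuous attribute, two classes that overlap around 0.2 to 0.5.
X = np.array([[0.0], [0.1], [0.2], [0.5], [1.0], [1.1]])
y = np.array([0, 0, 0, 1, 1, 1])

positive, boundary = vp_regions(X, y, delta=0.35, beta=1.0)
print(np.flatnonzero(positive))  # samples 0, 1, 4, 5 have pure neighborhoods
print(np.flatnonzero(boundary))  # samples 2 and 3 have mixed neighborhoods
```

Lowering `beta` to 0.7 in this toy data moves sample 2 into the positive region, since three quarters of its neighborhood share its class, which illustrates how the precision threshold trades boundary size against dependency.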
Acknowledgements
This research is supported by the National Natural Science Foundation of China under Grant No. 62166001 and by the Graduate Innovation Funding Program of Gannan Normal University, China, under Grant No. YCX22A025.
Author information
Contributions
Bowen Lin: Conceptualization, Methodology, Writing - original draft, Software, Validation, Data curation. Caihui Liu: Conceptualization, Methodology, Writing - review & editing, Validation, Supervision. Duoqian Miao: Writing - review & editing.
Ethics declarations
Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Lin, B., Liu, C. & Miao, D. An improved decision tree algorithm based on boundary mixed attribute dependency. Appl Intell 54, 2136–2153 (2024). https://doi.org/10.1007/s10489-023-05238-4
DOI: https://doi.org/10.1007/s10489-023-05238-4