Abstract
Partial consolidated tree bagging (PCTBagging) was presented as a multiple classifier that, depending on a parameter called the consolidation percentage, can lean towards the inner consolidated tree and obtain higher interpretability, or lean towards the ensemble and obtain higher discriminating capacity. At the extreme values, a consolidation percentage of 100% yields a consolidated tree (the CTC algorithm) and 0% yields Bagging. For intermediate values, the consolidated tree is collapsed to the number of internal nodes corresponding to the percentage, keeping the largest nodes. In this paper we propose a strategy to build the partial consolidated tree directly, i.e. without first building the complete consolidated tree, and we explore four further criteria, besides node size, for deciding which node to develop next in the partial consolidated tree: pre-order, gain ratio, gain ratio \(\times \) size, and level by level. The results show that the choice of criterion affects the discriminating capacity of the classifier for the same level of interpretability, and that this effect grows as the consolidation percentage increases.
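The driven construction summarized above can be pictured as a priority-driven expansion loop: instead of building the full consolidated tree and collapsing it afterwards, the partial tree is grown one node at a time until the budget of internal nodes set by the consolidation percentage is spent. The sketch below is a minimal illustration only, not the authors' implementation: the `Node` class, the `grow_partial_tree` function, and the pre-computed size and gain-ratio values are hypothetical, and only three of the five criteria are shown (pre-order and level-by-level would use a stack and a FIFO queue, respectively, instead of a heap).

```python
import heapq
import itertools
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    size: int            # training examples reaching the node
    gain_ratio: float    # gain ratio of the node's best split
    children: list = field(default_factory=list)

def grow_partial_tree(root, max_internal, criterion):
    """Expand at most max_internal nodes, choosing the next one by `criterion`.

    Returns the names of the nodes that became internal, in expansion order.
    """
    # heapq is a min-heap, so negate the scores: larger score = expanded first.
    keys = {
        "size": lambda n: -n.size,
        "gain_ratio": lambda n: -n.gain_ratio,
        "gain_ratio_x_size": lambda n: -n.gain_ratio * n.size,
    }
    key = keys[criterion]
    tie = itertools.count()  # stable tie-breaker so heapq never compares Nodes
    frontier = [(key(root), next(tie), root)]
    expanded = []
    while frontier and len(expanded) < max_internal:
        _, _, node = heapq.heappop(frontier)
        if not node.children:       # a pure/unsplittable node stays a leaf
            continue
        expanded.append(node.name)  # node becomes internal: apply its split
        for child in node.children:
            heapq.heappush(frontier, (key(child), next(tie), child))
    return expanded
```

With the "size" criterion this reproduces the original PCTBagging behaviour of keeping the largest nodes; swapping the key function changes which partial tree is obtained for the same interpretability budget.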
Notes
- 2.
  We are trying to update this package with the proposal of this paper.
Acknowledgments
This work was partially funded by grant PID2021-123087OB-I00, funded by MCIN/AEI/10.13039/501100011033 and by ERDF "A way of making Europe", and by the Department of Education, Universities and Research of the Basque Government (ADIAN, IT-1437-22). We would like to thank our former undergraduate student Josué Cabezas, who participated in the implementation of the Driven PCTBagging algorithm for the WEKA platform.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Pérez, J.M., Arbelaitz, O., Muguerza, J. (2024). Driven PCTBagging: Seeking Greater Discriminating Capacity for the Same Level of Interpretability. In: Alonso-Betanzos, A., et al. Advances in Artificial Intelligence. CAEPIA 2024. Lecture Notes in Computer Science(), vol 14640. Springer, Cham. https://doi.org/10.1007/978-3-031-62799-6_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-62798-9
Online ISBN: 978-3-031-62799-6
eBook Packages: Computer Science, Computer Science (R0)