Abstract
Classifier ensembles remain in the spotlight because of their proven usefulness in many practical problems. One of the most interesting approaches builds such models from classifiers trained on randomly selected attributes. Among the essential algorithms in this area is Random Forest, which uses decision trees as base classifiers. Although these methods achieve outstanding quality on many tasks, randomly selecting attributes for the individual base classifiers does not guarantee an optimal ensemble. In this paper, we propose DE-Forest, an ensemble classifier learning method that uses evolutionary techniques to select non-random, optimized attribute subsets. Differential Evolution was chosen as the optimization algorithm; it selects the best set of attributes for training each individual base classifier. The quality of the proposed method was evaluated in computer experiments on many benchmark datasets. The obtained results are promising and show the superiority of the proposed method over the benchmark solutions.
This work was supported by the Polish National Science Centre under the grant No. 2019/35/B/ST6/04442.
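The idea summarized in the abstract, Differential Evolution choosing the attribute subset for each base classifier, can be illustrated with a minimal sketch. It is not the authors' implementation: the encoding (one real-valued gene per tree and attribute, thresholded at 0.5), the DE/rand/1/bin variant, the validation-accuracy fitness, the use of depth-1 stumps in place of full decision trees, and all function names and parameter values are assumptions made only to keep the example short and dependency-free.

```python
import random

def fit_stump(X, y, features):
    """Train a depth-1 tree (a stand-in for a full decision tree) on the given feature indices."""
    best = (-1.0, features[0], 0.0, 0, 0)  # (train accuracy, feature, threshold, left label, right label)
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            l_lab = max(set(left), key=left.count)
            r_lab = max(set(right), key=right.count) if right else l_lab
            acc = (left.count(l_lab) + right.count(r_lab)) / len(y)
            if acc > best[0]:
                best = (acc, f, t, l_lab, r_lab)
    return best[1:]

def predict_stump(stump, row):
    f, t, l_lab, r_lab = stump
    return l_lab if row[f] <= t else r_lab

def decode(vector, n_trees, n_features):
    """Assumed encoding: one block of genes per tree; a gene above 0.5 selects that attribute."""
    subsets = []
    for k in range(n_trees):
        genes = vector[k * n_features:(k + 1) * n_features]
        chosen = [j for j, g in enumerate(genes) if g > 0.5]
        # Fall back to the largest gene so that no tree is left without attributes.
        subsets.append(chosen or [max(range(n_features), key=genes.__getitem__)])
    return subsets

def fitness(vector, n_trees, train, valid):
    """Majority-vote accuracy of the decoded ensemble on the validation split."""
    (Xt, yt), (Xv, yv) = train, valid
    stumps = [fit_stump(Xt, yt, fs) for fs in decode(vector, n_trees, len(Xt[0]))]
    hits = 0
    for row, lab in zip(Xv, yv):
        votes = [predict_stump(s, row) for s in stumps]
        hits += max(set(votes), key=votes.count) == lab
    return hits / len(yv)

def de_select(train, valid, n_trees=3, pop_size=10, gens=15, F=0.5, CR=0.9, seed=0):
    """DE/rand/1/bin over [0, 1]^(n_trees * n_features), maximising validation accuracy."""
    rng = random.Random(seed)
    n_features = len(train[0][0])
    dim = n_trees * n_features
    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    scores = [fitness(p, n_trees, train, valid) for p in pop]
    for _ in range(gens):
        new_pop, new_scores = list(pop), list(scores)
        for i in range(pop_size):
            a, b, c = rng.sample([p for j, p in enumerate(pop) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = [
                min(1.0, max(0.0, a[d] + F * (b[d] - c[d])))
                if rng.random() < CR or d == j_rand else pop[i][d]
                for d in range(dim)
            ]
            s = fitness(trial, n_trees, train, valid)
            if s >= scores[i]:  # greedy one-to-one replacement
                new_pop[i], new_scores[i] = trial, s
        pop, scores = new_pop, new_scores
    best = max(range(pop_size), key=scores.__getitem__)
    return decode(pop[best], n_trees, n_features), scores[best]

def make_data(n, rng):
    """Toy data: four attributes, only attribute 0 determines the label."""
    X = [[rng.random() for _ in range(4)] for _ in range(n)]
    return X, [int(row[0] > 0.5) for row in X]

rng = random.Random(1)
train, valid = make_data(40, rng), make_data(40, rng)
subsets, score = de_select(train, valid)
print(subsets, score)
```

In this toy setting the search is free to give each tree a different attribute subset, which is the contrast with purely random selection that the abstract draws. A realistic experiment would swap in full decision trees, a proper cross-validation protocol, and a metric suited to the data.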
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Grzyb, J., Woźniak, M. (2023). DE-Forest – Optimized Decision Tree Ensemble. In: Nguyen, N.T., et al. Computational Collective Intelligence. ICCCI 2023. Lecture Notes in Computer Science, vol 14162. Springer, Cham. https://doi.org/10.1007/978-3-031-41456-5_61
DOI: https://doi.org/10.1007/978-3-031-41456-5_61
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-41455-8
Online ISBN: 978-3-031-41456-5
eBook Packages: Computer Science, Computer Science (R0)