Abstract
Multi-criteria optimization is increasingly used to build classifier ensembles, including for the imbalanced data classification task. This leads to the problem of optimizing at least two criteria related to the prediction quality of the minority and majority classes, or to the so-called classification precision. The paper proposes MOOforest, a new method for building decision tree ensembles. It uses the MOEA/D optimization algorithm to produce a diverse pool of base classifiers by selecting the different feature subsets on which they are trained. From the resulting pool of non-dominated solutions, the final ensemble is chosen using the PROMETHEE method. Modifying the PROMETHEE weights allows the user to select a solution appropriate to the user's expectations (i.e., the weights indicate how important each optimization criterion is to the user). It is worth noting that during classifier ensemble training, the features selected for the base classifiers result from the optimization process, not, as in popular algorithms employing the Random Subspace approach (such as Random Forest), from random selection. The proposed method's advantage over that approach is confirmed through a comprehensive set of computer experiments.
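The selection pipeline the abstract describes — filtering the optimizer's output down to the Pareto-non-dominated solutions, then ranking them with user-supplied criterion weights — can be sketched in plain Python. This is an illustrative sketch, not the paper's implementation: the candidate scores, the "usual" step preference function, and the weights below are hypothetical, and PROMETHEE II is only one member of the PROMETHEE family.

```python
def non_dominated(points):
    """Keep the points not dominated by any other (all criteria maximized)."""
    def dominates(p, q):
        # p dominates q: at least as good everywhere, strictly better somewhere
        return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))
    return [p for p in points if not any(dominates(q, p) for q in points)]

def promethee_ii(scores, weights):
    """Rank alternatives by PROMETHEE II net outranking flow, using the
    'usual' preference function (a step on the pairwise difference)."""
    n = len(scores)
    def pref(a, b):
        # weighted preference of alternative a over alternative b
        return sum(w for s_a, s_b, w in zip(a, b, weights) if s_a > s_b)
    net = []
    for i in range(n):
        plus = sum(pref(scores[i], scores[j]) for j in range(n) if j != i) / (n - 1)
        minus = sum(pref(scores[j], scores[i]) for j in range(n) if j != i) / (n - 1)
        net.append(plus - minus)  # net flow: how strongly i outranks the rest
    ranking = sorted(range(n), key=lambda i: net[i], reverse=True)
    return ranking, net

# Hypothetical pool: (minority-class recall, majority-class recall) per candidate.
pool = [(0.90, 0.70), (0.75, 0.95), (0.85, 0.85), (0.80, 0.60)]
front = non_dominated(pool)                     # the last candidate is dominated
ranking, net = promethee_ii(front, (0.7, 0.3))  # user favours the minority class
best = front[ranking[0]]
```

With the weights skewed toward the first criterion (0.7 vs. 0.3), the ranking favours the candidate with the highest minority-class recall; shifting weight toward the second criterion would instead promote the majority-oriented solution — exactly the user control the abstract attributes to the PROMETHEE weights.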
Notes
Detailed results are available in the GitHub repository: https://github.com/w4k2/MOOforest.
Acknowledgement
This work was supported by the Polish National Science Centre under the grant No. 2019/35/B/ST6/04442.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Grzyb, J., Woźniak, M. (2023). MOOforest – Multi-objective Optimization to Form Decision Tree Ensemble. In: Pawelczyk, M., Bismor, D., Ogonowski, S., Kacprzyk, J. (eds) Advanced, Contemporary Control. PCC 2023. Lecture Notes in Networks and Systems, vol 709. Springer, Cham. https://doi.org/10.1007/978-3-031-35173-0_11
Print ISBN: 978-3-031-35172-3
Online ISBN: 978-3-031-35173-0
eBook Packages: Intelligent Technologies and Robotics (R0)