Abstract
Decision trees and random forests are among the most popular machine learning tools for binary classification and are used in many practical applications. During prediction, both methods define a neighborhood around each test instance, and class probabilities are usually computed from the proportion of classes within that neighborhood. The approach presented in this paper replaces this prediction mechanism with a probabilistic classifier based on the Nash equilibrium concept, applied to the local data selected by the random forest. Numerical experiments on synthetic data illustrate the behavior of the approach in a variety of settings.
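The standard neighborhood-based prediction that the paper proposes to replace can be sketched as follows: each tree of the forest maps a test point to a leaf, the training points sharing that leaf form the local neighborhood, and the class-1 probability is the proportion of class-1 neighbors, averaged over trees. This is a minimal illustration using scikit-learn, not the authors' method; all parameter values are assumptions for the sketch.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic binary-classification data, standing in for the paper's experiments.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, y_train, X_test = X[:150], y[:150], X[150:]

forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

# apply() returns, for each sample, the index of the leaf it lands in, per tree.
train_leaves = forest.apply(X_train)  # shape (n_train, n_trees)
test_leaves = forest.apply(X_test)    # shape (n_test, n_trees)

def leaf_neighborhood_proba(test_leaf_row):
    """Class-1 probability from leaf neighborhoods, averaged over trees."""
    probs = []
    for t, leaf in enumerate(test_leaf_row):
        # Training points that share this tree's leaf with the test point.
        neighbors = y_train[train_leaves[:, t] == leaf]
        probs.append(neighbors.mean())  # proportion of class 1 in the leaf
    return float(np.mean(probs))

p1 = np.array([leaf_neighborhood_proba(row) for row in test_leaves])
```

The Nash-equilibrium-based classifier described in the paper would replace the `neighbors.mean()` step, computing probabilities from a game played over this same local data rather than from raw class proportions.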
This work was supported by a grant of the Romanian Ministry of Education and Research, CNCS - UEFISCDI, project number PN-III-P4-ID-PCE-2020-2360, within PNCDI III.
© 2022 Springer Nature Switzerland AG
Cite this paper
Suciu, M.-A., Lung, R.I. (2022). A New Game Theoretic Based Random Forest for Binary Classification. In: García Bringas, P., et al. (eds.) Hybrid Artificial Intelligent Systems. HAIS 2022. Lecture Notes in Computer Science, vol. 13469. Springer, Cham. https://doi.org/10.1007/978-3-031-15471-3_11
Print ISBN: 978-3-031-15470-6
Online ISBN: 978-3-031-15471-3