Abstract
Decision trees (DTs) are popular techniques in the field of explainable machine learning. Traditionally, DTs are induced using a top-down greedy search that is usually fast; however, it may lead to sub-optimal solutions. Here, we focus on an alternative approach: evolutionary induction. It explores the search space globally and produces less complex DTs, but is much more time-consuming. Various parallel computing approaches have been considered, of which the GPU-based one appears to be the most efficient. To speed up the induction further, different GPU memory organizations/layouts can be exploited.
In this paper, we introduce a compact in-memory representation of DTs. It is a one-dimensional array representation in which links between parent and child tree nodes are stored explicitly next to the node data (tests in internal nodes, classes in leaves, etc.). In contrast, in the complete representation, children positions are calculated from the parent's position. However, this requires a one-dimensional array large enough to hold a completely filled tree of the given depth, regardless of whether all nodes actually exist. Experimental validation is performed on real-life and artificial datasets of various sizes and dimensions. The results show that the compact representation not only reduces the memory requirements but also decreases the induction time.
Acknowledgements
This work was supported by Bialystok University of Technology, Poland, under Grant WZ/WI-IIT/4/2023, funded by the Ministry of Science and Higher Education.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Jurczuk, K., Czajkowski, M., Kretowski, M. (2023). Compact In-Memory Representation of Decision Trees in GPU-Accelerated Evolutionary Induction. In: Wyrzykowski, R., Dongarra, J., Deelman, E., Karczewski, K. (eds) Parallel Processing and Applied Mathematics. PPAM 2022. Lecture Notes in Computer Science, vol 13826. Springer, Cham. https://doi.org/10.1007/978-3-031-30442-2_10
Print ISBN: 978-3-031-30441-5
Online ISBN: 978-3-031-30442-2