Abstract
There is tremendous growth in data generated by different industries, e.g., health, agriculture, and engineering, and a consequent demand for more processing power. Compared to central processing units (CPUs), general-purpose graphics processing units (GPUs) are rapidly emerging as a promising way to achieve high performance and energy efficiency across various computing domains. However, the multiple forms of parallelism and the complexity of memory access on GPUs pose a challenge to developing a GPU-based Random Forest (RF) algorithm. RF is a popular and robust machine learning algorithm. In this paper, coarse-grained and dynamic parallelism approaches on the GPU are integrated into RF (dpRFGPU). Experimental results for dpRFGPU are compared with a sequential CPU execution of RF (seqRFCPU) and an implementation that parallelises RF trees on the GPU (parRFGPU). Results show the average speedup improves from 1.62 for parRFGPU to 3.57 for dpRFGPU. Acceleration is also evident in both dpRFGPU and parRFGPU when RF is configured with, on average, 32 trees or more on low-dimensional datasets. Moreover, larger datasets save more time on the GPU than smaller ones, with dpRFGPU saving more time than parRFGPU. The dpRFGPU approach thus significantly accelerates the parallel construction of RF trees on the GPU by reducing training time.









Notes
Data independence in RF is facilitated by bagging. In bagging, each tree is built from an independent subset of the data, where each subset is generated by randomly sampling the original dataset with replacement [5].
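As an illustration, the bootstrap sampling underlying bagging can be sketched in a few lines of Python (a minimal sketch of the standard technique; the helper names are ours, not from the paper):

```python
import random

def bootstrap_sample(data, rng=random):
    """Draw len(data) records uniformly at random, with replacement."""
    n = len(data)
    return [data[rng.randrange(n)] for _ in range(n)]

def bagging_subsets(data, n_trees):
    """One independent bootstrap subset per tree, as used in bagging."""
    return [bootstrap_sample(data) for _ in range(n_trees)]

data = list(range(10))
subsets = bagging_subsets(data, n_trees=4)
# Each subset has the same size as the original dataset, but sampling
# with replacement means it typically contains duplicates and omits
# some records, which makes the trees' training data independent.
```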
This is caused by irregular execution paths of threads in a warp.
This is caused by irregular execution paths of warps in a block.
Single-instruction multiple-thread (SIMT) is an execution model in which one instruction is dispatched to a group of threads that execute it in a multi-threaded, lockstep manner.
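The divergence effect described in these notes can be illustrated with a toy model (our own Python sketch, not code from the paper): under SIMT, threads in a warp that take different branch paths have those paths serialized, so a divergent warp pays roughly one pass per distinct path rather than the cost of a single path.

```python
def warp_passes(branch_per_thread):
    """Toy SIMT cost model: a warp executes every distinct branch path
    taken by any of its threads, one serialized pass per path."""
    return len(set(branch_per_thread))

# A uniform warp (all 32 threads take branch A) needs a single pass.
uniform = ["A"] * 32
# A divergent warp (half take A, half take B) needs two serialized
# passes, roughly doubling the warp's execution time.
divergent = ["A"] * 16 + ["B"] * 16

print(warp_passes(uniform))    # 1
print(warp_passes(divergent))  # 2
```

The same idea explains the block-level note: blocks whose warps follow irregular paths finish at different times, delaying the whole kernel.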
Compute capability determines the general specifications and available features of a GPU.
The selected datasets had a variety of characteristics (e.g., dimensionality and number of records) that could reduce bias in the experiments. These datasets also worked well with the program prototypes this research developed for the experiments.
References
Kirk DB, Hwu WW (2010) Programming massively parallel processors. Elsevier Inc., eBook ISBN: 9780123814739
Zheng R, Hu Q, Jin H (2018) GPUPerfML: a performance analytical model based on decision tree for GPU architectures. In: The Proceedings of the 20th International Conference on High Performance Computing and Communications, IEEE. https://doi.org/10.1109/HPCC/SmartCity/DSS.2018.00110
Senagi K, Jouandeau N (2018) Confidence in Random Forest for performance optimization. In: Bramer M, Petridis M (eds) Artificial intelligence. XXXV SGAI 2018. Lecture notes in computer science, vol 11311. Springer, Cham. https://doi.org/10.1007/978-3-030-04191-5_31
Vouzis PD, Sahinidis NV (2011) GPU-BLAST: using graphics processors to accelerate protein sequence alignment. Bioinformatics 27(2):182–188. https://doi.org/10.1093/bioinformatics/btq644
Breiman L (2001) Random Forests. Mach Learn 45(1):5–32. https://doi.org/10.1023/A:1010933404324
Zhang J, Wang H, Feng W (2017) cuBLASTP: fine-grained parallelization of protein sequence search on CPU+GPU. IEEE/ACM Trans Comput Biol Bioinform 14(4). https://doi.org/10.1109/TCBB.2015.2489662
Wang J, Rubin N, Sidelnik A, Yalamanchili S (2016) LaPerm: locality aware scheduler for dynamic parallelism on GPUs. In: The Proceeding of the ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), vol. 44(3), pp 583–595, IEEE. https://doi.org/10.1109/ISCA.2016.57
Rich C, Alexandru NM (2006) An empirical comparison of supervised learning algorithms. In: ICML ’06 Proceedings of the 23rd International Conference on Machine learning, pp 161–168, ACM. https://doi.org/10.1145/1143844.1143865
Manuel FD, Eva C, Senen B (2014) Do we need hundreds of classifiers to solve real world classification problems? J Mach Learn Res 15:3133–3181
Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140
Nawar S, Mouazen AM (2017) Comparison between Random Forests, artificial neural networks and gradient boosted machines methods of on-line vis-NIR spectroscopy measurements of soil total nitrogen and total carbon. Sensors. https://doi.org/10.3390/s17102428
Lie C, Deng J, Cao K, Xiao Y, Ma L, Wang W, Ma T, Shu C (2018) A comparison of Random Forest and support vector machine approaches to predict coal spontaneous combustion in gob. Fuel 239:297–311. https://doi.org/10.1016/j.fuel.2018.11.006
Wen Z, He B, Ramamohanarao K, Lu S, Shi J (2018) Efficient gradient boosted decision tree training on GPUs. In: The Proceedings of the International Parallel and Distributed Processing Symposium, IEEE. https://doi.org/10.1109/IPDPS.2018.00033
Daga M, Nutter M (2012) Exploiting Coarse-grained parallelism in B+ tree Searches on an APU. In: The Proceedings of the SC Companion: High Performance Computing, Networking Storage and Analysis, USA, IEEE. https://doi.org/10.1109/SC.Companion.2012.40
Chen J, Li K, Tang Z, Bilal K, Yu S, Weng C, Li K (2017) A parallel Random Forest algorithm for big data in a spark cloud computing environment. IEEE Trans Parallel Distrib Syst 28(4):919–933. https://doi.org/10.1109/TPDS.2016.2603511
Genuer R, Poggi J, Tuleau-Malot C, Villa-Vialaneix N (2017) Random Forests for big data. Big Data Res 9:28–46. https://doi.org/10.1016/j.bdr.2017.07.003
Lo WT, Chang YS, Sheu RK, Chiu CC, Yuan SM (2014) CUDT: a CUDA based decision tree algorithm. Sci World J. https://doi.org/10.1155/2014/745640
Hughes C, Hughes T (2008) Professional multicore programming: design and implementation for C++ developers. Wiley Publishing, Inc.
NVIDIA Corporation. CUDA Toolkit. [Online]. https://developer.nvidia.com/cuda-toolkit. Accessed April 2019
Quinlan JR (1994) C4.5 programs for machine learning. Mach Learn 16:235–240
Rauber T, Rünger G (2010) Parallel programming for multicore and cluster systems. Springer-Verlag, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04818-0
LeBard DN, Levine BG, Mertmann P, Barr SA, Jusufi A, Sanders S, Klein ML, Panagiotopoulos AZ (2012) Self-assembly of coarse-grained ionic surfactants accelerated by graphics processing units. Soft Matter. https://doi.org/10.1039/c1sm06787g
Nickolls J, Dally WJ (2010) The GPU computing Era. IEEE Micro. https://doi.org/10.1109/MM.2010.41
NVIDIA Corporation. CUDA documentation. [Online]. https://docs.nvidia.com/cuda/index.html. Accessed April 2019
Barlas G (2015) Multicore and GPU programming an integrated approach. Elsevier Inc
Luo GH, Huang SK, Chang YS, Yuan SM (2013) A parallel bees algorithm implementation on GPU. J Syst Archit. https://doi.org/10.1016/j.sysarc.2013.09.007
Nasridinov A, Lee Y, Park YH (2013) Decision tree construction on GPU: ubiquitous parallel computing approach. Computing. https://doi.org/10.1007/s00607-013-0343-z
Lettich F, Lucchese C, Maria Nardini F, Orlando S, Perego R, Tonellotto N, Venturini R (2018) Parallel traversal of large ensembles of decision trees. IEEE Trans Parallel Distrib Syst. https://doi.org/10.1109/TPDS.2018.2860982
You Y, Zhang Z, Hsieh CJ, Demmel J, Keutzer K (2019) Fast deep neural network training on distributed systems and cloud TPUs. IEEE Trans Parallel Distrib Syst. https://doi.org/10.1109/TPDS.2019.2913833
Mahale K, Kanaskar S, Kapadnis P, Desale M, Walunj SM (2015) Acceleration of game tree search using GPGPU. In: The Proceedings of the International Conference on Green Computing and Internet of Things (ICGCIoT), IEEE. https://doi.org/10.1109/ICGCIoT.2015.7380525
Senagi K, Jouandeau N (2018) A non-deterministic strategy for searching optimal number of trees hyperparameter in Random Forest. In: Proceedings of the Federated Conference on Computer Science and Information Systems (FedCSIS), IEEE. https://doi.org/10.15439/2018F202
Oshiro TP, Perez SJ, Baranauskas A (2012) How many trees in a Random Forest? In: Proceedings of the International Workshop on Machine Learning and Data Mining in Pattern Recognition, Springer, Berlin, Heidelberg, pp 154–168. https://doi.org/10.1007/978-3-642-31537-4_13
Dua D, Taniskidou KE (2017) UCI machine learning repository. [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science
NVIDIA Corporation. Profiler user's guide. [Online]. https://docs.nvidia.com/cuda/profiler-users-guide/#nvprof-overview. Accessed April 2019
Senagi K, Jouandeau N, Kamoni P (2017) Using parallel Random Forest classifier in predicting land suitability for crop production. J Agric Inform 8(3):23–32
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Senagi, K., Jouandeau, N. Parallel construction of Random Forest on GPU. J Supercomput 78, 10480–10500 (2022). https://doi.org/10.1007/s11227-021-04290-6