Hyperparameter Tuning of Random Forests Using Radial Basis Function Models

  • Conference paper
Machine Learning, Optimization, and Data Science (LOD 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13810)

Abstract

This paper considers the problem of tuning the hyperparameters of a random forest (RF) algorithm, which can be formulated as a discrete black-box optimization problem. Although the default hyperparameter settings in RF software packages work well in many cases, tuning these hyperparameters can improve the predictive performance of the RF. On large data sets, this tuning becomes a computationally expensive black-box optimization problem. A suitable approach is a surrogate-based method, in which a surrogate model approximates the functional relationship between the hyperparameters and the overall out-of-bag (OOB) prediction error of the RF. This paper develops such a surrogate-based method for discrete black-box optimization and applies it to tuning RF hyperparameters. Global and local variants of the proposed method, both using radial basis function (RBF) surrogates, are applied to tune the RF hyperparameters on seven regression data sets involving up to 81 predictors and over 21,000 data points. Given a limited budget on the number of hyperparameter settings evaluated, the RBF algorithms obtained better overall OOB RMSE than discrete global random search, discrete local random search, and a Bayesian optimization approach.
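
To make the surrogate-based tuning loop described in the abstract concrete, here is a minimal Python sketch. It assumes scikit-learn's RandomForestRegressor for the OOB error and SciPy's RBFInterpolator as the surrogate: a cubic RBF model is fit to the (hyperparameter setting, OOB RMSE) pairs evaluated so far, and the random candidate the surrogate predicts to be best is evaluated next. The search-space ranges, the uniform candidate sampler, and the evaluation budget are hypothetical stand-ins, not the paper's actual algorithm.

```python
# Illustrative sketch of surrogate-based RF hyperparameter tuning with an RBF
# model, under stated assumptions; NOT the paper's algorithm. It alternates
# between fitting an RBF surrogate of the OOB RMSE and evaluating the
# candidate setting the surrogate predicts to be best.
import numpy as np
from scipy.interpolate import RBFInterpolator
from sklearn.datasets import make_friedman1
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X, y = make_friedman1(n_samples=500, n_features=10, random_state=0)

def oob_rmse(setting):
    """Fit an RF with one discrete hyperparameter setting; return OOB RMSE."""
    mtry, min_leaf, ntree = (int(v) for v in setting)
    rf = RandomForestRegressor(
        n_estimators=ntree,
        max_features=mtry,
        min_samples_leaf=min_leaf,
        oob_score=True,
        n_jobs=-1,
        random_state=0,
    ).fit(X, y)
    return float(np.sqrt(np.mean((rf.oob_prediction_ - y) ** 2)))

def sample_settings(n):
    """Uniform random points from a hypothetical discrete search space."""
    return np.column_stack([
        rng.integers(1, X.shape[1] + 1, n),  # max_features ("mtry")
        rng.integers(1, 21, n),              # min_samples_leaf
        rng.integers(50, 501, n),            # n_estimators
    ]).astype(float)

H = sample_settings(8)                        # initial design
f = np.array([oob_rmse(h) for h in H])

budget = 20                                   # total RF fits allowed
while len(f) < budget:
    surrogate = RBFInterpolator(H, f, kernel="cubic", degree=1)
    cand = sample_settings(500)
    pick = cand[np.argmin(surrogate(cand))]   # surrogate-predicted minimizer
    if any(np.array_equal(pick, h) for h in H):
        pick = sample_settings(1)[0]          # avoid re-evaluating a point
    H = np.vstack([H, pick])
    f = np.append(f, oob_rmse(pick))

i = int(np.argmin(f))
print(f"best OOB RMSE {f[i]:.4f} at (mtry, min_leaf, ntree) = {H[i].astype(int)}")
```

The paper's global and local variants differ in how candidate settings are generated (across the whole space versus near the current best solution); the uniform sampler above merely stands in for such a scheme.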

Author information

Corresponding author

Correspondence to Rommel G. Regis.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Regis, R.G. (2023). Hyperparameter Tuning of Random Forests Using Radial Basis Function Models. In: Nicosia, G., et al. Machine Learning, Optimization, and Data Science. LOD 2022. Lecture Notes in Computer Science, vol 13810. Springer, Cham. https://doi.org/10.1007/978-3-031-25599-1_23

  • DOI: https://doi.org/10.1007/978-3-031-25599-1_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25598-4

  • Online ISBN: 978-3-031-25599-1

  • eBook Packages: Computer Science, Computer Science (R0)
