
Efficient hyperparameters optimization through model-based reinforcement learning with experience exploiting and meta-learning


Abstract

Hyperparameter optimization plays a significant role in the overall performance of machine learning algorithms. However, the computational cost of evaluating an algorithm can be extremely high for complex algorithms or large datasets. In this paper, we propose a model-based reinforcement learning (RL) method with an experience variable and meta-learning to speed up hyperparameter optimization. Specifically, an RL agent selects hyperparameters, and the k-fold cross-validation result serves as the reward signal for updating the agent. To guide the agent's policy update, we design an embedding representation called the "experience variable" and update it dynamically during training. In addition, we employ a predictive model to estimate the performance of the machine learning algorithm under the selected hyperparameters, and we limit model rollouts to a short horizon to reduce the impact of model inaccuracy. Finally, we use meta-learning to pre-train the predictive model so that it adapts quickly to new tasks. To demonstrate the advantages of our method, we conduct experiments on 25 real HPO tasks; the results show that, under limited computational resources, the proposed method outperforms state-of-the-art Bayesian and evolutionary methods.
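To make the workflow concrete, the following Python sketch mirrors the loop described above: an agent proposes hyperparameters, k-fold cross-validation supplies the reward, and a learned performance model screens candidates cheaply before an expensive real evaluation. The epsilon-greedy sampler, the SVM search space, and the random-forest performance model are simplifying assumptions for illustration, not the paper's actual agent, RL algorithm, or predictive model.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Hypothetical discrete search space for an SVM (illustrative only).
space = [{"C": c, "gamma": g} for c in (0.1, 1.0, 10.0) for g in (0.01, 0.1, 1.0)]
feats = np.log10([[h["C"], h["gamma"]] for h in space])  # inputs to the model

def reward(hp):
    """The k-fold cross-validation accuracy is the reward signal."""
    return cross_val_score(SVC(**hp), X, y, cv=5).mean()

surrogate = RandomForestRegressor(random_state=0)  # predictive performance model
tried, rewards = [], []
rng = np.random.default_rng(0)

for step in range(15):
    if len(rewards) >= 5:
        surrogate.fit(feats[tried], rewards)
        # Short-horizon use of the model: rank all candidates cheaply with the
        # model, then run the expensive real evaluation on the best one
        # (epsilon-greedy exploration stands in for the RL agent's policy).
        idx = int(np.argmax(surrogate.predict(feats)))
        if rng.random() < 0.2:
            idx = int(rng.integers(len(space)))
    else:
        idx = int(rng.integers(len(space)))
    tried.append(idx)
    rewards.append(reward(space[idx]))  # expensive real evaluation

best = int(np.argmax(rewards))
print(f"best CV accuracy {rewards[best]:.3f} with {space[tried[best]]}")
```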


Data Availability

Not applicable.

Code Availability

Not applicable.


Funding

This work was supported by the National Natural Science Foundation of China (grant no. 61503059).

Author information


Contributions

This manuscript tackles the hyperparameter optimization problem for machine learning models. A novel method based on reinforcement learning is proposed to find good hyperparameters more quickly and efficiently. Our contributions are summarized as follows (a minimal sketch of the meta-pretraining idea follows this list):

1. We designed an embedding representation called the "experience variable" to guide the agent's policy update, which improves the final accuracy.

2. We employed a predictive model to predict the performance of the machine learning algorithm under the selected hyperparameters, which accelerates training. To trade off accuracy against efficiency, model rollouts are restricted to a short horizon, reducing the impact of model inaccuracy.

3. To further accelerate training, we used meta-learning to pre-train the predictive model so that it adapts quickly to new tasks.
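For intuition, here is a minimal, self-contained sketch of meta-pretraining a performance model for fast adaptation. It uses a Reptile-style first-order update on toy linear tasks; the toy tasks, the linear model, and this specific update rule are illustrative assumptions, not the paper's actual predictive model or meta-learning procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 2                                    # hyperparameter dimension (toy)
theta = np.zeros(D + 1)                  # meta-initialization (weights + bias)

def make_task():
    """A toy 'HPO task': a random linear map from hyperparameters to performance."""
    w, b = rng.normal(size=D), rng.normal()
    X = rng.uniform(0, 1, size=(32, D))
    return X, X @ w + b

def sgd(params, X, y, lr=0.1, steps=10):
    """A few inner gradient steps on one task (least-squares loss)."""
    p = params.copy()
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append bias column
    for _ in range(steps):
        grad = 2 * Xb.T @ (Xb @ p - y) / len(X)
        p -= lr * grad
    return p

for _ in range(200):                     # meta-training across many tasks
    X, y = make_task()
    adapted = sgd(theta, X, y)
    theta += 0.1 * (adapted - theta)     # Reptile meta-update: move toward adapted weights

# On a new task, only a few gradient steps from theta are needed.
X_new, y_new = make_task()
fast = sgd(theta, X_new, y_new, steps=3)
```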

Corresponding author

Correspondence to Jia Wu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest and no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Liu, X., Wu, J. & Chen, S. Efficient hyperparameters optimization through model-based reinforcement learning with experience exploiting and meta-learning. Soft Comput 27, 8661–8678 (2023). https://doi.org/10.1007/s00500-023-08050-x

