
Efficient hyperparameters optimization through model-based reinforcement learning with experience exploiting and meta-learning


Abstract

Hyperparameter optimization plays a significant role in the overall performance of machine learning algorithms. However, the computational cost of evaluating an algorithm can be extremely high for complex algorithms or large datasets. In this paper, we propose a model-based reinforcement learning (RL) method with an experience variable and meta-learning to speed up hyperparameter optimization. Specifically, an RL agent selects hyperparameters, and the k-fold cross-validation result serves as the reward signal for updating the agent. To guide the agent's policy update, we design an embedding representation called the "experience variable" and update it dynamically during training. In addition, we employ a predictive model to estimate the performance of the machine learning algorithm under the selected hyperparameters, and we limit model rollouts to a short horizon to reduce the impact of model inaccuracy. Finally, we use meta-learning to pre-train the predictive model so that it adapts quickly to new tasks. To demonstrate the advantages of our method, we conduct experiments on 25 real HPO tasks; the results show that, under limited computational resources, the proposed method outperforms state-of-the-art Bayesian and evolutionary methods.
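To make the workflow concrete, the following Python sketch mirrors the loop described above: an agent proposes hyperparameters, k-fold cross-validation supplies the reward, and a learned performance model screens candidates cheaply before an expensive real evaluation. The epsilon-greedy sampler, the SVM search space, and the random-forest performance model are simplifying assumptions for illustration, not the paper's actual agent, RL algorithm, or predictive model.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Hypothetical discrete search space for an SVM (illustrative only).
space = [{"C": c, "gamma": g} for c in (0.1, 1.0, 10.0) for g in (0.01, 0.1, 1.0)]
feats = np.log10([[h["C"], h["gamma"]] for h in space])  # inputs to the model

def reward(hp):
    """The k-fold cross-validation accuracy is the reward signal."""
    return cross_val_score(SVC(**hp), X, y, cv=5).mean()

surrogate = RandomForestRegressor(random_state=0)  # predictive performance model
tried, rewards = [], []
rng = np.random.default_rng(0)

for step in range(15):
    if len(rewards) >= 5:
        surrogate.fit(feats[tried], rewards)
        # Short-horizon use of the model: rank all candidates cheaply with the
        # model, then run the expensive real evaluation on the best one
        # (epsilon-greedy exploration stands in for the RL agent's policy).
        idx = int(np.argmax(surrogate.predict(feats)))
        if rng.random() < 0.2:
            idx = int(rng.integers(len(space)))
    else:
        idx = int(rng.integers(len(space)))
    tried.append(idx)
    rewards.append(reward(space[idx]))  # expensive real evaluation

best = int(np.argmax(rewards))
print(f"best CV accuracy {rewards[best]:.3f} with {space[tried[best]]}")
```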


Data Availability

Not applicable.

Code Availability

Not applicable.


Funding

This work was supported by the National Natural Science Foundation of China (grant no. 61503059).

Author information


Contributions

This manuscript tackles the hyperparameter optimization problem for machine learning models. A novel method based on reinforcement learning is proposed to find good hyperparameters more quickly and efficiently. Our contributions are summarized as follows (a minimal sketch of the meta-pretraining idea follows this list):

1. We designed an embedding representation called the "experience variable" to guide the agent's policy update, which improves the final accuracy.

2. We employed a predictive model to predict the performance of the machine learning algorithm under the selected hyperparameters, which accelerates training. To trade off accuracy against efficiency, model rollouts are restricted to a short horizon, reducing the impact of model inaccuracy.

3. To further accelerate training, we used meta-learning to pre-train the predictive model so that it adapts quickly to new tasks.
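For intuition, here is a minimal, self-contained sketch of meta-pretraining a performance model for fast adaptation. It uses a Reptile-style first-order update on toy linear tasks; the toy tasks, the linear model, and this specific update rule are illustrative assumptions, not the paper's actual predictive model or meta-learning procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 2                                    # hyperparameter dimension (toy)
theta = np.zeros(D + 1)                  # meta-initialization (weights + bias)

def make_task():
    """A toy 'HPO task': a random linear map from hyperparameters to performance."""
    w, b = rng.normal(size=D), rng.normal()
    X = rng.uniform(0, 1, size=(32, D))
    return X, X @ w + b

def sgd(params, X, y, lr=0.1, steps=10):
    """A few inner gradient steps on one task (least-squares loss)."""
    p = params.copy()
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append bias column
    for _ in range(steps):
        grad = 2 * Xb.T @ (Xb @ p - y) / len(X)
        p -= lr * grad
    return p

for _ in range(200):                     # meta-training across many tasks
    X, y = make_task()
    adapted = sgd(theta, X, y)
    theta += 0.1 * (adapted - theta)     # Reptile meta-update: move toward adapted weights

# On a new task, only a few gradient steps from theta are needed.
X_new, y_new = make_task()
fast = sgd(theta, X_new, y_new, steps=3)
```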

Corresponding author

Correspondence to Jia Wu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest and no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Liu, X., Wu, J. & Chen, S. Efficient hyperparameters optimization through model-based reinforcement learning with experience exploiting and meta-learning. Soft Comput 27, 8661–8678 (2023). https://doi.org/10.1007/s00500-023-08050-x

