Abstract
Hyperparameter optimization plays a significant role in the overall performance of machine learning algorithms. However, the computational cost of evaluating an algorithm can be extremely high for complex algorithms or large datasets. In this paper, we propose a hyperparameter optimization method that combines model-based reinforcement learning with an experience variable and meta-learning to speed up the training process. Specifically, an RL agent is employed to select hyperparameters, and the k-fold cross-validation result is used as the reward signal to update the agent. To guide the agent's policy update, we design an embedding representation called the "experience variable" and update it dynamically during training. In addition, we employ a predictive model to predict the performance of the machine learning algorithm under the selected hyperparameters, and we limit model rollouts to a short horizon to reduce the impact of model inaccuracy. Finally, we use meta-learning to pre-train the predictive model so that it adapts quickly to new tasks. To demonstrate the advantages of our method, we conduct experiments on 25 real HPO tasks; the results show that, under limited computational resources, the proposed method outperforms state-of-the-art Bayesian methods and an evolutionary method.
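The outer optimization loop described above can be summarized in a few lines. The following is a minimal sketch, not the paper's implementation: it assumes a placeholder RandomAgent whose select_hyperparameters() and update() methods are hypothetical stand-ins for the model-based RL agent with the experience variable, and it uses scikit-learn's cross_val_score on the digits dataset as the k-fold cross-validation reward.

```python
# Minimal sketch of the HPO loop: an agent proposes hyperparameters and the
# k-fold cross-validation accuracy is fed back as the reward signal.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_digits(return_X_y=True)

class RandomAgent:
    """Placeholder agent: samples hyperparameters uniformly at random.
    A real agent would maintain a policy and an experience variable."""
    def select_hyperparameters(self, rng):
        return {"n_estimators": int(rng.integers(10, 200)),
                "max_depth": int(rng.integers(2, 20))}
    def update(self, hp, reward):
        pass  # a real agent would update its policy from (hp, reward)

rng = np.random.default_rng(0)
agent, best = RandomAgent(), (None, -np.inf)
for step in range(20):
    hp = agent.select_hyperparameters(rng)
    # k-fold cross-validation accuracy serves as the reward signal
    reward = cross_val_score(RandomForestClassifier(**hp), X, y, cv=5).mean()
    agent.update(hp, reward)
    if reward > best[1]:
        best = (hp, reward)
print("best hyperparameters:", best)
```

In the paper, each true cross-validation evaluation is expensive, which is what motivates the predictive model and short-horizon rollouts discussed below.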
Data Availability
Not applicable.
Code Availability
Not applicable.
Funding
This work was supported by the National Natural Science Foundation of China (Grant No. 61503059).
Author information
Contributions
This manuscript tackles the hyperparameter optimization problem for machine learning models. A novel method based on reinforcement learning is proposed to find hyperparameters more quickly and efficiently. Our contributions are summarized as follows:
1. We designed an embedding representation called the "experience variable" to guide the agent's policy update, which improves the final accuracy.
2. We employed a predictive model to predict the performance of the machine learning algorithm under the selected hyperparameters, which accelerates the training process. To trade off accuracy against efficiency, model rollouts are restricted to a short horizon to reduce the impact of model inaccuracy (illustrated in the sketch below).
3. To further accelerate training, we used meta-learning to pre-train the predictive model so that it adapts quickly to a new task.
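Contribution 2 limits model rollouts to a short horizon so that surrogate prediction errors do not compound. The following is a hedged sketch of that idea under simplifying assumptions: a RandomForestRegressor stands in for the paper's learned predictive model, and HORIZON, propose(), and true_reward() are illustrative names rather than the paper's implementation.

```python
# Sketch of short-horizon model rollouts: candidates are scored with a cheap
# surrogate for a few steps, and only the best one is re-evaluated for real.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import cross_val_score

X, y = load_digits(return_X_y=True)
rng = np.random.default_rng(1)
HORIZON = 3  # short rollout length: limits compounding surrogate error

def propose():
    # illustrative hyperparameter sampler over two dimensions
    return np.array([rng.integers(10, 200), rng.integers(2, 20)], dtype=float)

def true_reward(hp):
    # expensive k-fold cross-validation evaluation
    clf = RandomForestClassifier(n_estimators=int(hp[0]), max_depth=int(hp[1]))
    return cross_val_score(clf, X, y, cv=5).mean()

# bootstrap the surrogate with a few real evaluations
history_hp = [propose() for _ in range(5)]
history_r = [true_reward(hp) for hp in history_hp]

for outer in range(5):
    surrogate = RandomForestRegressor().fit(history_hp, history_r)
    # short-horizon rollout: score candidates with the cheap surrogate only
    candidates = [propose() for _ in range(HORIZON)]
    preds = surrogate.predict(candidates)
    best_candidate = candidates[int(np.argmax(preds))]
    # the most promising candidate is then evaluated on the true objective
    history_hp.append(best_candidate)
    history_r.append(true_reward(best_candidate))

print("best observed cross-validation score:", max(history_r))
```

Keeping HORIZON small means the surrogate is only trusted for a handful of steps before being corrected with a real evaluation, which is the accuracy/efficiency trade-off named in contribution 2; pre-training the surrogate with meta-learning (contribution 3) would replace the cold-start bootstrap phase shown here.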
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest and no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, X., Wu, J. & Chen, S. Efficient hyperparameters optimization through model-based reinforcement learning with experience exploiting and meta-learning. Soft Comput 27, 8661–8678 (2023). https://doi.org/10.1007/s00500-023-08050-x