Abstract
Parameter control methods for metaheuristics based on reinforcement learning proposed so far usually suffer from three shortcomings: (1) their training processes are highly time-consuming and cannot benefit from parallel or distributed platforms; (2) they are sensitive to their hyperparameters, so the quality of the final results depends heavily on the chosen values; and (3) only limited benchmarks have been used to assess their generality. This paper addresses these issues by proposing a methodology for training out-of-the-box parameter control policies for mono-objective, non-niching evolutionary and swarm-based algorithms using distributed reinforcement learning with population-based training. The proposed methodology can be applied to any mono-objective optimization problem and to any mono-objective, non-niching evolutionary or swarm-based algorithm. Extensive experiments show that the proposed method satisfactorily mitigates all the aforementioned issues, outperforming constant, random, and human-designed policies in several different scenarios.
Data Availability
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.
Code Availability
The following link takes the reader to a git repository for the implementation of the proposed training method used in our experiments: https://github.com/lacerdamarcelo/rl_based_parameter_control_ea_si.
References
Aine, S., Kumar, R., & Chakrabarti, P. P. (2006). Adaptive parameter control of evolutionary algorithms under time constraints. In A. Tiwari, R. Roy, J. Knowles, E. Avineri, & K. Dahal (Eds.), Applications of Software Computing. Berlin: Springer.
Aleti, A., & Moser, I. (2016). A systematic literature review of adaptive parameter control methods for evolutionary algorithms. ACM Computing Survey, 49(3), 56–15635. https://doi.org/10.1145/2996355
Aleti, A., Moser, I., Meedeniya, I., & Grunske, L. (2014). Choosing the appropriate forecasting model for predictive parameter control. Evolutionary Computation, 22(2), 319–349.
Aleti, A., & Moser, I. (2013). Entropy-based adaptive range parameter control for evolutionary algorithms. In: Proceedings of the 15th annual conference on genetic and evolutionary computation. GECCO ’13. ACM, NY, USA, pp. 1501–1508. https://doi.org/10.1145/2463372.2463560.
Aleti, A., Moser, I., & Mostaghim, S. (2012). Adaptive range parameter control. In: 2012 IEEE congress on evolutionary computation, pp. 1–8 https://doi.org/10.1109/CEC.2012.6256567
Aleti, A., & Moser, I. (2011). Predictive parameter control. In: Proceedings of the 13th annual conference on genetic and evolutionary computation. GECCO ’11. ACM, NY, pp. 561–568. https://doi.org/10.1145/2001576.2001653.
Antoniou, M., Hribar, R., & Papa, G. (2021). Parameter control in evolutionary optimisation. In M. Vasile (Ed.), Optimization under uncertainty with applications to aerospace engineering (pp. 357–385). Cham: Springer. https://doi.org/10.1007/978-3-030-60166-9_11.
Awad, N. H., Ali, M. Z., Suganthan, P. N., Liang, J. J., & Qu, B. Y. (2016). Problem definitions and evaluation criteria for the CEC 2017 special session and competition on single objective real-parameter numerical optimization. Technical report, Nanyang Technological University, Singapore.
Balaprakash, P., Birattari, M., & Stützle, T. (2007a). Improvement strategies for the f-race algorithm: Sampling design and iterative refinement. In T. Bartz-Beielstein, M. J. Blesa Aguilera, C. Blum, B. Naujoks, A. Roli, G. Rudolph, & M. Sampels (Eds.), Hybrid Metaheuristics (pp. 108–122). Berlin, Heidelberg: Springer.
Balaprakash, P., Birattari, M., & Stützle, T. (2007b). Improvement strategies for the f-race algorithm: Sampling design and iterative refinement. In: Hybrid Metaheuristics. Springer, Berlin, pp. 108–122.
Bielza, C., del Pozo, J. A. F., & Larrañaga, P. (2013). Parameter control of genetic algorithms by learning and simulation of bayesian networks - a case study for the optimal ordering of tables. Journal of Computer Science and Technology, 28(4), 720–731.
Birattari, M., Yuan, Z., Balaprakash, P., & Stützle, T. (2010). F-race and iterated F-race: An overview. In: Experimental methods for the analysis of optimization algorithms. Springer, Singapore.
Birattari, M., Stützle, T., Paquete, L., & Varrentrapp, K. (2002) A racing algorithm for configuring metaheuristics. In: Proceedings of the 4th annual conference on genetic and evolutionary computation. GECCO’02. Morgan Kaufmann Publishers Inc., San Francisco, CA, pp. 11–18.
Bonabeau, E., Dorigo, M., & Theraulaz, G. (1999). From Natural to Artificial Swarm Intelligence. USA: Oxford University Press Inc.
Chatzinikolaou, N. (2011). Coordinating evolution: An open, peer-to-peer architecture for a self-adapting genetic algorithm. In: Enterprise information systems, vol. 73. Springer, Berlin.
Das, S., Mullick, S. S., & Suganthan, P. N. (2016). Recent advances in differential evolution - an updated survey. Swarm and Evolutionary Computation, 27, 1–30. https://doi.org/10.1016/j.swevo.2016.01.004
Dorigo, M. (1992). Optimization, learning and natural algorithms. PhD thesis, Politecnico di Milano, Italy.
Eberhart, R. C. (2007). Computational Intelligence: Concepts to Implementations. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.
Eiben, A. E., Hinterding, R., & Michalewicz, Z. (1999). Parameter control in evolutionary algorithms. IEEE Transactions on Evolutionary Computation, 3(2), 124–141. https://doi.org/10.1109/4235.771166
Eiben, A. E., & Smith, J. E. (2015). Introduction to evolutionary computing (2nd ed.). Singapore: Springer.
Eiben, A.E., Horvath, M., Kowalczyk, W., & Schut, M.C. (2007). Reinforcement learning for online control of evolutionary algorithms. In: Proceedings of the 4th international conference on engineering self-organising systems. ESOA’06, pp. 151–160. Springer, Berlin. http://dl.acm.org/citation.cfm?id=1763581.1763595
Engelbrecht, A. P. (2007). Computational intelligence: An introduction (2nd ed.). Hoboken: Wiley Publishing.
Filho, C.J.A.B., de Lima Neto, F.B., Lins, A.J.C.C., Nascimento, A.I.S., & Lima, M.P. (2008). A novel search algorithm based on fish school behavior. In: 2008 IEEE International conference on systems, Man and Cybernetics, Melbourne, pp. 2646–2651. https://doi.org/10.1109/ICSMC.2008.4811695
Filho, C. J. A. B., de Lima Neto, F. B., Lins, A. J. C. C., Nascimento, A. I. S., & Lima, M. P. (2009). Fish school search. In R. Chiong (Ed.), Nature-inspired algorithms for optimisation (pp. 261–277). Berlin: Springer.
Filho, C.J.A.B., Neto, F.B.L., Sousa, M.F.C., Pontes, M.R., & Madeiro, S.S. (2009). On the influence of the swimming operators in the fish school search algorithm. In: 2009 IEEE International Conference on Systems, Man and Cybernetics, Melbourne, pp. 5012–5017.
Fortnow, L. (2009). The status of the P versus NP problem. Communications of the ACM, 52(9), 78–86. https://doi.org/10.1145/1562164.1562186
Foulds, L. (1983). The heuristic problem-solving approach. Journal of the Operational Research Society, 34, 927–934.
Fujimoto, S., van Hoof, H., & Meger, D. (2018). Addressing Function Approximation Error in Actor-Critic Methods. In International conference on machine learning. PMLR, NY, pp. 1587–1596.
Guan, Y., Yang, L., & Sheng, W. (2017). Population control in evolutionary algorithms: Review and comparison. In Bio-inspired computing: Theories and applications (pp. 161–174).
Horgan, D., Quan, J., Budden, D., Barth-Maron, G., Hessel, M., van Hasselt, H., & Silver, D. (2018). Distributed Prioritized Experience Replay. arXiv preprint arXiv:1803.00933
Hristakeva, M. (2004) Solving the 0–1 knapsack problem with genetic algorithms. In Midwest instruction and computing symposium, pp. 16–17
Ilavarasi, K., & Joseph, K.S. (2014). Variants of travelling salesman problem: A survey. In: International conference on information communication and embedded systems (ICICES2014), pp. 1–7.
Jaderberg, M., Dalibard, V., Osindero, S., Czarnecki, W.M., Donahue, J., Razavi, A., Vinyals, O., Green, T., Dunning, I., Simonyan, K., Fernando, C., & Kavukcuoglu, K. (2017). Population based training of neural networks. arXiv preprint arXiv:1711.09846
Karafotias, G., Hoogendoorn, M., & Eiben, A. E. (2015a). Parameter control in evolutionary algorithms: Trends and challenges. IEEE Transactions on Evolutionary Computation, 19(2), 167–187. https://doi.org/10.1109/TEVC.2014.2308294
Karafotias, G., Smit, S.K., & Eiben, A.E. (2012). A generic approach to parameter control. In: Proceedings of the 2012 European conference on the applications of evolutionary computation. EvoApplications ’12.
Karafotias, G., Eiben, A.E., & Hoogendoorn, M. (2014a). Generic parameter control with reinforcement learning. In: Proceedings of the 2014 annual conference on genetic and evolutionary computation. GECCO ’14, pp. 1319–1326.
Karafotias, G., Hoogendoorn, M., & Weel, B. (2014b). Comparing generic parameter controllers for EAs. In: Proceedings of the 2014 IEEE symposium series on computational intelligence. SSCI ’14, pp. 16–53.
Karafotias, G., Hoogendoorn, M., & Eiben, A.E. (2015b). Evaluating reward definitions for parameter control. In: Proceedings of the 2015 European conference on the applications of evolutionary computation. EvoApplications ’15, pp. 667–680.
Kennedy, J., & Eberhart, R.C. (1995). Particle swarm optimization. In: Proceedings of the IEEE international conference on neural networks, pp. 1942–1948.
Kingma, D.P., & Ba, J. (2014) Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980
de Lacerda, M.G.P., de Andrade Amorim Neto, H., Ludermir, T.B., Kuchen, H., & de Lima Neto, F.B. (2018). Population size control for efficiency and efficacy optimization in population based metaheuristics. In: 2018 IEEE congress on evolutionary computation (CEC), pp. 1–8. https://doi.org/10.1109/CEC.2018.8477792
Leung, S. W., Yuen, S. Y., & Chow, C. K. (2012). Parameter control system of evolutionary algorithm that is aided by the entire search history. Applied Soft Computing, 12(9), 3063–3078. https://doi.org/10.1016/j.asoc.2012.05.008
Liang, E., Liaw, R., Moritz, P., Nishihara, R., Fox, R., Goldberg, K., Gonzalez, J.E., Jordan, M.I., & Stoica, I. (2017). RLlib: Abstractions for Distributed Reinforcement Learning. In International conference on machine learning. PMLR, NY, pp. 3053–3062
Liashchynskyi, P., & Liashchynskyi, P. (2019). Grid search, random search, genetic algorithm: A big comparison for NAS. arXiv preprint arXiv:1912.06059
Lynn, N., & Suganthan, P. N. (2015a). Heterogeneous comprehensive learning particle swarm optimization with enhanced exploration and exploitation. Swarm and Evolutionary Computation, 24, 11–24.
Lynn, N., & Suganthan, P. N. (2015b). Heterogeneous comprehensive learning particle swarm optimization with enhanced exploration and exploitation. Swarm and Evolutionary Computation, 24, 11–24. https://doi.org/10.1016/j.swevo.2015.05.002
Maturana, J., & Saubion, F. (2008). On the design of adaptive control strategies for evolutionary algorithms. In: Proceedings of the Evolution Artificielle, 8th international conference on artificial evolution. EA’07. Springer, Berlin, pp. 303–315. http://dl.acm.org/citation.cfm?id=1793671.1793702
Mersmann, O., Bischl, B., Trautmann, H., Preuss, M., Weihs, C., & Rudolph, G. (2011). Exploratory landscape analysis. In: Proceedings of the 13th annual conference on genetic and evolutionary computation. GECCO ’11, pp. 829–836. Association for Computing Machinery, New York, USA. https://doi.org/10.1145/2001576.2001690.
Michalewicz, Z., & Arabas, J. (1994). Genetic algorithms for the 0/1 knapsack problem. In Z. W. Ras & M. Zemankova (Eds.), Methodologies for Intelligent Systems (pp. 134–143). Berlin: Springer.
Miguel de Gomez, A., & Toosi, F. (2021). Continuous parameter control in genetic algorithms using policy gradient reinforcement learning. In: Proceedings of the 13th international joint conference on computational intelligence (IJCCI 2021), pp. 115–122.
Nocedal, J., & Wright, S. J. (2006). Numerical optimization (2nd ed.). New York: Springer.
Panigrahi, B. K., Shi, Y., & Lim, M.-H. (2011). Handbook of Swarm intelligence: Concepts, principles and applications (1st ed.). Singapore: Springer.
Parker-Holder, J., Nguyen, V., & Roberts, S. (2021). Provably efficient online hyperparameter optimization with population-based bandits. Advances in Neural Information Processing Systems, 33, 17200–17211.
Parpinelli, R. S., Plichoski, G. F., & da Silva, R. S. (2019). A review of techniques for on-line control of parameters in swarm intelligence and evolutionary computation algorithms. International Journal of Bio-inspired Computation, 13(1), 1–17.
de Lacerda, M. G. P., de Araujo Pessoa, L. F., de Lima Neto, F. B., Ludermir, T. B., & Kuchen, H. (2021). A systematic literature review on general parameter control for evolutionary and swarm-based algorithms. Swarm and Evolutionary Computation, 60, 100777. https://doi.org/10.1016/j.swevo.2020.100777
Pisinger, D. (2005). Where are the hard knapsack problems? Computers & Operations Research, 32(9), 2271–2284. https://doi.org/10.1016/j.cor.2004.03.002
Quevedo, J., Abdelatti, M., Imani, F., & Sodhi, M. (2021). Using reinforcement learning for tuning genetic algorithms. In: Proceedings of the genetic and evolutionary computation conference companion. GECCO ’21. Association for Computing Machinery, New York, NY, pp. 1503–1507. https://doi.org/10.1145/3449726.3463203.
Rost, A., Petrova, I., & Buzdalova, A. (2016). Adaptive parameter selection in evolutionary algorithms by reinforcement learning with dynamic discretization of parameter range. In: Proceedings of the 2016 on genetic and evolutionary computation. GECCO ’16.
Rummery, G. A., & Niranjan, M. (1994). On-line Q-learning using connectionist systems. Technical Report CUED/F-INFENG/TR 166, Cambridge University Engineering Department.
Schuchardt, J., Golkov, V., & Cremers, D. (2019). Learning to Evolve. arXiv preprint arXiv:1905.03389
Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347
Sharma, M., Komninos, A., Ibanez, M.L., & Kazakov, D. (2019). Deep reinforcement learning based parameter control in differential evolution. In: Proceedings of the genetic and evolutionary computation conference. GECCO ’19. ACM, New York.
Silver, E. (2004). An overview of heuristic solution methods. Journal of the Operational Research Society, 55(9), 936–956.
Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A., Lanctot, M., Sifre, L., Kumaran, D., Graepel, T., Lillicrap, T., Simonyan, K., & Hassabis, D. (2017) Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. arXiv preprint arXiv:1712.01815
Storn, R., & Price, K. (1997). Differential evolution - a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization, 11(4), 341–359. https://doi.org/10.1023/A:1008202821328
Sutton, R. S., & Barto, A. G. (2018a). Reinforcement Learning: An Introduction. Cambridge: A Bradford Book.
Sutton, R. S., & Barto, A. G. (2018b). Reinforcement learning: An introduction (2nd ed.). Cambridge: The MIT Press.
Szepesvari, C. (2010). Algorithms for reinforcement learning. San Rafael: Morgan & Claypool Publishers.
Talbi, E.-G. (2009). Metaheuristics: From design to implementation. Hoboken: Wiley Publishing.
Watkins, C. J. C. H., & Dayan, P. (1992). Q-learning. Machine Learning, 8(3), 279–292. https://doi.org/10.1007/BF00992698
Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67–82.
Zhang, J., Chen, W.-N., Zhan, Z.-H., Yu, W.-J., Li, Y.-L., Chen, N., & Zhou, Q. (2012). A survey on algorithm adaptation in evolutionary computation. Frontiers of Electrical and Electronic Engineering, 7(1), 16–31. https://doi.org/10.1007/s11460-012-0192-0
Acknowledgements
The authors of this paper would like to thank CNPq and CAPES (Brazil) for funding the research that originated this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
1.1 TD3’s hyperparameters
- Update delay between policy and Q-function parameters: 2 (i.e., for each policy update, the Q-function is updated twice);
- Target noise (i.e., variance of the Gaussian noise \(\epsilon\)): 0.2;
- Target noise clip (i.e., c): 0.5;
- Standard deviation of the zero-mean Gaussian noise added to the actions: 0.1;
- \(\gamma\): 0.99;
- Initial random steps (i.e., number of steps with random decisions executed before the algorithm starts learning): 45,000;
- Adam \(\beta _1\): 0.9;
- Adam \(\beta _2\): 0.999;
- Adam \(\epsilon\): \(10^{-7}\).
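As a minimal sketch (not the authors' implementation), the target-policy smoothing implied by the target-noise and clip values above can be written as follows; the function name and action bounds are illustrative:

```python
import random

# TD3 hyperparameters from the list above
TARGET_NOISE = 0.2       # scale of the Gaussian noise epsilon (listed above as its variance)
TARGET_NOISE_CLIP = 0.5  # clip bound c

def clip(x, lo, hi):
    return max(lo, min(hi, x))

def smoothed_target_action(target_policy_action, action_low=-1.0, action_high=1.0):
    """TD3 target-policy smoothing: add clipped Gaussian noise to the
    target policy's action, then clip back to the valid action range."""
    eps = clip(random.gauss(0.0, TARGET_NOISE), -TARGET_NOISE_CLIP, TARGET_NOISE_CLIP)
    return clip(target_policy_action + eps, action_low, action_high)
```

With an action of 0.0, the smoothed target action always stays within \(\pm c = \pm 0.5\).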
1.2 PBT’s hyperparameters
- Perturbation interval: 4;
- Quantile fraction: 0.125;
- Resample probability: 0.5.
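To show how these three values interact, the following is a hedged sketch of one PBT exploit/explore step (the function names and the perturbation factors 0.8/1.2 are common PBT conventions, not taken from the paper's implementation):

```python
import random

QUANTILE_FRACTION = 0.125   # top/bottom fraction used by exploit
RESAMPLE_PROBABILITY = 0.5  # explore: resample a fresh value vs. perturb the copied one
PERTURB_FACTORS = (0.8, 1.2)

def pbt_exploit_explore(population, sample_fn):
    """One PBT step over a population of (score, hyperparams) pairs.
    Members in the bottom quantile copy a top-quantile member (exploit)
    and then mutate each hyperparameter (explore). `sample_fn(name)`
    draws a fresh value for hyperparameter `name`."""
    ranked = sorted(population, key=lambda p: p[0], reverse=True)
    k = max(1, int(len(ranked) * QUANTILE_FRACTION))
    top, bottom = ranked[:k], ranked[-k:]
    new_population = ranked[:-k]  # survivors keep their configurations
    for _score, _params in bottom:
        src_score, src_params = random.choice(top)  # exploit: copy a top performer
        params = dict(src_params)
        for name in params:                         # explore: resample or perturb
            if random.random() < RESAMPLE_PROBABILITY:
                params[name] = sample_fn(name)
            else:
                params[name] *= random.choice(PERTURB_FACTORS)
        new_population.append((src_score, params))
    return new_population
```

With a population of 8 and quantile fraction 0.125, exactly one worker is replaced per perturbation interval.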
1.3 I/F-race’s hyperparameters
- Number of parameter configurations evaluated for each new F-Race process: 48;
- Minimum number of F-Race iterations before it starts removing bad setups: 20;
- Parameter setup generator standard deviation: 0.3 × the range of values of the parameter;
- Minimum number of configurations in the F-Race pool: 10;
- Maximum number of F-Race iterations: 30.
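The sampling rule above (standard deviation equal to 0.3 times the parameter's range) can be sketched as follows; the function name and the clipping to the parameter's bounds are illustrative assumptions, not details from the paper:

```python
import random

STD_FRACTION = 0.3  # parameter setup generator std = 0.3 * parameter range

def sample_candidate(elite_config, param_ranges):
    """Draw a new parameter configuration around an elite one: one
    Gaussian draw per parameter, clipped to the parameter's range."""
    candidate = {}
    for name, (lo, hi) in param_ranges.items():
        std = STD_FRACTION * (hi - lo)
        value = random.gauss(elite_config[name], std)
        candidate[name] = min(hi, max(lo, value))
    return candidate
```

For example, sampling around an elite DE configuration with F = 2.0 over the range [0.01, 4] uses a standard deviation of 0.3 × 3.99 ≈ 1.2.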
1.4 Human-designed parameter control policies
- HCLPSO (Lynn & Suganthan, 2015b):
  - w: linear decrease from 0.99 to 0.2;
  - c: linear decrease from 3 to 1.5;
  - Candidate solution step: linear decrease from 0.1 to 0.000001;
  - c1: linear decrease from 2.5 to 0.5;
  - c2: linear increase from 0.5 to 2.5;
  - m: 5.
- FSS (Filho et al., 2009):
  - Candidate solution step: linear decrease from 0.1 to 0.000001;
  - Volitive step: twice the candidate solution step;
  - Maximum weight: 5000.
- DE (Das et al., 2016):
  - F: 2;
  - Crossover probability: 0.5.
- ACO (Das et al., 2016):
  - \(\alpha\): 1;
  - \(\beta\): 2;
  - \(\rho\): 0.98;
  - Probability of using the best ant ever to update the pheromone trail instead of the best ant in the iteration: linear increase from 0 to 1.
- GA (Michalewicz & Arabas, 1994; Hristakeva, 2004):
  - Mutation probability: 0.1;
  - Crossover probability: 0.75;
  - Elitism size: 2.
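The linear schedules above all follow the same interpolation rule; a minimal sketch (the function name is illustrative):

```python
def linear_schedule(start, end, t, t_max):
    """Value of a linearly interpolated parameter at iteration t:
    t = 0 gives `start`, t = t_max gives `end`."""
    return start + (end - start) * (t / t_max)

# HCLPSO inertia weight w: linear decrease from 0.99 to 0.2
w_first = linear_schedule(0.99, 0.2, 0, 100)    # 0.99 at the first iteration
w_last = linear_schedule(0.99, 0.2, 100, 100)   # 0.2 at the last iteration
```

The same function covers the increasing schedules (e.g., c2 from 0.5 to 2.5) by passing end > start.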
1.5 Sampling interval for the random parameter control policy
- HCLPSO:
  - w: [0.2, 0.99];
  - c: [1.5, 3];
  - c1: [0.5, 2.5];
  - c2: [0.5, 2.5];
  - m: 5.
- FSS:
  - Candidate solution step: [0, 0.1];
  - Volitive step: [\(-\)0.2, 0.2] (the RL algorithm decides whether to contract or not).
- DE:
  - F: [0.01, 4];
  - Crossover probability: [0.01, 1].
- ACO:
  - \(\alpha\): [0, 4];
  - \(\beta\): [0, 4];
  - \(\rho\): [0, 1];
  - Probability of using the best ant ever to update the pheromone trail instead of the best ant in the iteration: [0, 1].
- GA:
  - Mutation probability: [0.001, 1];
  - Crossover probability: [0.001, 1];
  - Elitism size: [1, 5].
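The random parameter control policy simply draws each parameter uniformly from its sampling interval at every decision point; a minimal sketch, using the DE intervals above (the dictionary keys are illustrative names):

```python
import random

# Sampling intervals for DE under the random policy (from the list above)
DE_INTERVALS = {
    "F": (0.01, 4.0),
    "crossover_probability": (0.01, 1.0),
}

def random_policy(intervals):
    """Random parameter control policy: draw each parameter uniformly
    from its sampling interval, independently at every decision point."""
    return {name: random.uniform(lo, hi) for name, (lo, hi) in intervals.items()}
```

Integer-valued parameters such as the GA elitism size would additionally be rounded to the nearest integer in the interval.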
1.6 Fully detailed experimental results
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
de Lacerda, M.G.P., de Lima Neto, F.B., Ludermir, T.B. et al. Out-of-the-box parameter control for evolutionary and swarm-based algorithms with distributed reinforcement learning. Swarm Intell 17, 173–217 (2023). https://doi.org/10.1007/s11721-022-00222-z
DOI: https://doi.org/10.1007/s11721-022-00222-z