Abstract
Many real-world reinforcement learning (RL) problems involve multiple conflicting objectives that must be optimized simultaneously. Finding the optimal policies (known as Pareto optimal policies) for different preferences over the objectives requires extensive exploration of the state space, so obtaining a dense set of Pareto optimal policies is challenging and often sample inefficient. In this paper, we propose a hybrid multiobjective policy optimization approach for solving multiobjective reinforcement learning (MORL) problems with continuous actions. Our approach combines the fast convergence of multiobjective policy gradient (MOPG) with a surrogate-assisted multiobjective evolutionary algorithm (MOEA) to produce a dense set of Pareto optimal policies. The solutions found by the MOPG algorithm are used to build computationally inexpensive surrogate models in the policy parameter space that approximate policy returns. An MOEA is then executed that uses the surrogates' mean predictions and prediction uncertainty to find approximately optimal policies. The final solution policies are evaluated in the simulator and stored in an archive. Tests on multiobjective continuous-action RL benchmarks show that a hybrid surrogate-assisted multiobjective evolutionary optimizer with a robust selection criterion produces a dense set of Pareto optimal policies without extensive exploration of the state space. We also apply the proposed approach to train Pareto optimal agents for autonomous driving, where the hybrid approach produces superior results compared to a state-of-the-art MOPG algorithm.
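The following Python sketch illustrates the overall pipeline described above: surrogates fitted in policy parameter space on MOPG-found solutions, evolutionary search driven by an uncertainty-aware surrogate score, and simulator re-evaluation of the final candidates. It is only a minimal sketch under stated assumptions, not the authors' implementation: `evaluate_policy`, the random stand-in for the MOPG step, the Gaussian-process surrogates, and the mutation scheme are all hypothetical placeholders.

```python
# Minimal sketch of the hybrid surrogate-assisted MORL loop (hypothetical names).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def evaluate_policy(theta):
    """Placeholder simulator rollout: returns a vector of objective returns (maximized)."""
    return np.array([-np.sum(theta ** 2), -np.sum((theta - 1.0) ** 2)])

def nondominated(F):
    """Indices of nondominated rows of F (maximization)."""
    keep = []
    for i, fi in enumerate(F):
        if not any(np.all(fj >= fi) and np.any(fj > fi)
                   for j, fj in enumerate(F) if j != i):
            keep.append(i)
    return keep

rng = np.random.default_rng(0)
dim, n_init, n_gen, pop = 8, 40, 30, 60

# 1) Policies found by MOPG (here: random stand-ins) and their simulated returns.
thetas = rng.normal(size=(n_init, dim))
returns = np.array([evaluate_policy(t) for t in thetas])

# 2) One surrogate per objective, trained in the policy parameter space.
surrogates = [GaussianProcessRegressor().fit(thetas, returns[:, m])
              for m in range(returns.shape[1])]

# 3) Surrogate-assisted evolutionary search: score offspring with an
#    uncertainty-aware criterion (mean + kappa * std) and keep nondominated ones.
kappa = 1.0
parents = thetas.copy()
for _ in range(n_gen):
    offspring = parents[rng.integers(len(parents), size=pop)] \
                + 0.1 * rng.normal(size=(pop, dim))
    preds = []
    for gp in surrogates:
        mu, std = gp.predict(offspring, return_std=True)
        preds.append(mu + kappa * std)
    score = np.stack(preds, axis=1)
    parents = offspring[nondominated(score)]

# 4) Final candidates are re-evaluated in the simulator and archived.
archive_returns = np.array([evaluate_policy(t) for t in parents])
archive = parents[nondominated(archive_returns)]
print(f"Archive size: {len(archive)}")
```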
Notes
1. Code is available at https://github.com/amrzr/SA-MOEAMOPG.
Acknowledgements
The work is supported by the Artificial Intelligence for Urban Low-Emission Autonomous Traffic (AIforLessAuto) project, funded under the Green and Digital transition call from the Academy of Finland. The research project has been granted funding from the European Union (NextGenerationEU) through the Academy of Finland under project number 347199.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Mazumdar, A., Kyrki, V. (2024). Hybrid Surrogate Assisted Evolutionary Multiobjective Reinforcement Learning for Continuous Robot Control. In: Smith, S., Correia, J., Cintrano, C. (eds) Applications of Evolutionary Computation. EvoApplications 2024. Lecture Notes in Computer Science, vol 14635. Springer, Cham. https://doi.org/10.1007/978-3-031-56855-8_4
DOI: https://doi.org/10.1007/978-3-031-56855-8_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-56854-1
Online ISBN: 978-3-031-56855-8
eBook Packages: Computer Science, Computer Science (R0)