
Hybrid Surrogate Assisted Evolutionary Multiobjective Reinforcement Learning for Continuous Robot Control

  • Conference paper in Applications of Evolutionary Computation (EvoApplications 2024)

Abstract

Many real-world reinforcement learning (RL) problems involve multiple conflicting objectives that must be optimized simultaneously. Finding the optimal policies for different preferences over the objectives, known as Pareto optimal policies, requires extensive exploration of the state space, so obtaining a dense set of Pareto optimal policies is challenging and often reduces sample efficiency. In this paper, we propose a hybrid policy optimization approach for solving multiobjective reinforcement learning (MORL) problems with continuous actions. Our approach combines the fast convergence of multiobjective policy gradient (MOPG) methods with a surrogate-assisted multiobjective evolutionary algorithm (MOEA) to produce a dense set of Pareto optimal policies. The solutions found by the MOPG algorithm are used to build computationally inexpensive surrogate models in the parameter space of the policies that approximate the policies' returns. An MOEA is then executed that exploits the surrogates' mean predictions and prediction uncertainties to find approximately optimal policies; the resulting policies are evaluated in the simulator and stored in an archive. Tests on multiobjective continuous-action RL benchmarks show that a hybrid surrogate-assisted multiobjective evolutionary optimizer with a robust selection criterion produces a dense set of Pareto optimal policies without extensively exploring the state space. We also apply the proposed approach to train Pareto optimal agents for autonomous driving, where the hybrid approach produced superior results compared to a state-of-the-art MOPG algorithm.
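The pipeline the abstract describes can be sketched in miniature. The following is not the authors' implementation but a hedged toy illustration of the general scheme: simulator-evaluated "MOPG" solutions seed cheap per-objective surrogates in policy-parameter space, an evolutionary loop selects offspring by surrogate mean plus uncertainty, and the survivors are re-evaluated on the simulator and archived. All names (`knn_surrogate`, `acquisition`, `KAPPA`, the toy return function) are illustrative assumptions, and a simple k-nearest-neighbour predictor stands in for the surrogate model.

```python
# Toy sketch of a surrogate-assisted multiobjective evolutionary loop in
# policy-parameter space. Illustrative only; all names are assumptions.
import random
import math

random.seed(0)
DIM, K_NEIGHBOURS, KAPPA, GENERATIONS = 4, 5, 1.0, 20

def true_returns(theta):
    # Stand-in for expensive episodic returns from a simulator:
    # two conflicting objectives to be maximized.
    f1 = -sum((t - 1.0) ** 2 for t in theta)
    f2 = -sum((t + 1.0) ** 2 for t in theta)
    return (f1, f2)

def dominates(a, b):
    # Pareto dominance for maximization.
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

# "MOPG" stage stand-in: policies already evaluated on the simulator.
train_x = [[random.uniform(-2, 2) for _ in range(DIM)] for _ in range(40)]
train_y = [true_returns(x) for x in train_x]

def knn_surrogate(theta):
    # Cheap surrogate: k-nearest-neighbour mean prediction with a
    # std-based uncertainty estimate, per objective.
    order = sorted(range(len(train_x)),
                   key=lambda i: sum((a - b) ** 2
                                     for a, b in zip(theta, train_x[i])))
    nn = [train_y[i] for i in order[:K_NEIGHBOURS]]
    out = []
    for obj in range(2):
        vals = [y[obj] for y in nn]
        mean = sum(vals) / len(vals)
        std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
        out.append((mean, std))
    return out

def acquisition(theta):
    # Optimistic selection score: surrogate mean + KAPPA * uncertainty.
    return tuple(m + KAPPA * s for m, s in knn_surrogate(theta))

population = [list(x) for x in train_x[:10]]
for _ in range(GENERATIONS):
    offspring = [[t + random.gauss(0, 0.2) for t in p] for p in population]
    pool = population + offspring
    scores = [acquisition(p) for p in pool]
    # Keep candidates whose acquisition value is non-dominated in the pool.
    population = [p for p, s in zip(pool, scores)
                  if not any(dominates(s2, s) for s2 in scores)][:10]

# Final stage: evaluate survivors on the simulator and archive the
# mutually non-dominated policies.
evaluated = [(p, true_returns(p)) for p in population]
archive = [(p, y) for p, y in evaluated
           if not any(dominates(y2, y) for _, y2 in evaluated)]
```

The key design point mirrored here is that only the seed set and the final survivors touch the expensive simulator; the inner evolutionary loop runs entirely on the cheap surrogate.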


Notes

  1. Code can be found at https://github.com/amrzr/SA-MOEAMOPG.


Acknowledgements

The work is supported by Artificial Intelligence for Urban Low-Emission Autonomous Traffic (AIforLessAuto) funded under the Green and Digital transition call from the Academy of Finland. The research project has been granted funding from the European Union (NextGenerationEU) through the Academy of Finland under project number 347199.

Author information

Correspondence to Atanu Mazumdar.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Mazumdar, A., Kyrki, V. (2024). Hybrid Surrogate Assisted Evolutionary Multiobjective Reinforcement Learning for Continuous Robot Control. In: Smith, S., Correia, J., Cintrano, C. (eds) Applications of Evolutionary Computation. EvoApplications 2024. Lecture Notes in Computer Science, vol 14635. Springer, Cham. https://doi.org/10.1007/978-3-031-56855-8_4

  • DOI: https://doi.org/10.1007/978-3-031-56855-8_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-56854-1

  • Online ISBN: 978-3-031-56855-8

