
Optimal Control and Reinforcement Learning for Robot: A Survey

Conference paper

Abstract

As systems and their applications have grown more complex, conventional control approaches have become limited by system complexity and required functionality. The development of reinforcement learning and optimal control has become an impetus for engineering and has shown great potential for automation. Currently, optimization applications in robotics face challenges caused by model bias, high-dimensional systems, and computational complexity. To address these issues, several studies have proposed data-driven optimization approaches. This survey reviews achievements in optimal control and reinforcement learning for robots. It is not a complete and exhaustive survey, but it presents some recent and notable results in optimal control for robots. The background and problem statement are introduced first. Solutions to existing issues in robot control and some notable control methods in these areas are then briefly reviewed. In addition, the survey discusses future development prospects along four research directions: improving control efficiency, artificial assistant learning, applications in extreme environments, and related subjects. From this perspective, interdisciplinary research is essential for engineering fields built on optimal control methods; it would not only make engineering equipment more intelligent but also extend the applications of optimal control approaches.
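As a concrete point of reference for the model-based optimal control the survey covers, the sketch below solves a finite-horizon linear-quadratic regulator (LQR) by backward Riccati recursion. This is a generic textbook illustration, not an algorithm from the paper; the double-integrator dynamics, cost weights, and horizon are arbitrary choices made for the example.

```python
import numpy as np

# Finite-horizon LQR via backward Riccati recursion (illustrative sketch).
# Dynamics: x_{t+1} = A x_t + B u_t; cost: sum of x'Qx + u'Ru over the horizon.

def lqr_gains(A, B, Q, R, horizon):
    """Return time-varying feedback gains K_t such that u_t = -K_t x_t."""
    P = Q.copy()                       # terminal cost-to-go matrix
    gains = []
    for _ in range(horizon):
        # K = (R + B'PB)^{-1} B'PA
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)  # Riccati backward update
        gains.append(K)
    return gains[::-1]                 # reorder gains forward in time

# Double-integrator example (unit mass, unit time step) -- arbitrary test system.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[0.1]])

K = lqr_gains(A, B, Q, R, horizon=50)
x = np.array([[1.0], [0.0]])           # initial state: offset position, zero velocity
for t in range(50):
    u = -K[t] @ x                      # optimal linear state feedback
    x = A @ x + B @ u
print("final state:", x.ravel())       # regulated toward the origin
```

The recursion above assumes an exact model (A, B). The data-driven approaches the abstract alludes to replace such known dynamics with learned or sampled quantities, which is precisely where model bias and sample efficiency become the central difficulties.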




Acknowledgment

This work was supported by the Research Development Fund RDF-20-01-08 provided by Xi’an Jiaotong-Liverpool University.

Author information


Corresponding author

Correspondence to Yuqing Chen.



Copyright information

© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper


Cite this paper

Feng, H., Yu, L., Chen, Y. (2021). Optimal Control and Reinforcement Learning for Robot: A Survey. In: Gao, H., Wang, X. (eds) Collaborative Computing: Networking, Applications and Worksharing. CollaborateCom 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 406. Springer, Cham. https://doi.org/10.1007/978-3-030-92635-9_4


  • DOI: https://doi.org/10.1007/978-3-030-92635-9_4


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-92634-2

  • Online ISBN: 978-3-030-92635-9

  • eBook Packages: Computer Science, Computer Science (R0)
