Driving Reinforcement Learning with Models

Rathi, Meghana; Ferraro, Pietro; Russo, Giovanni

doi:10.1007/978-3-030-55180-3_6

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1250)

Included in the following conference series:

Proceedings of SAI Intelligent Systems Conference

Abstract

In this paper we propose a new approach to complement reinforcement learning (RL) with model-based control, in particular Model Predictive Control (MPC). We introduce an algorithm, MPC-augmented RL (MPRL), that combines RL and MPC in a novel way so that they augment each other’s strengths. We demonstrate the effectiveness of MPRL by letting it play the Atari game Pong. For this task, the results show that MPRL outperforms both RL and MPC when these are used individually.

The work in this paper was completed while the authors were at the School of Electrical & Electronic Engineering, University College Dublin, Belfield, Dublin, Ireland.

Notes

1. See http://gym.openai.com/docs/ for documentation on the environment observation space.

Author information

Authors and Affiliations

IBM, Dublin, Ireland
Meghana Rathi
Dyson School of Design Engineering, Imperial College London, South Kensington, London, UK
Pietro Ferraro
Department of Information and Electronic Engineering and Applied Mathematics, Università degli Studi di Salerno, Fisciano, Salerno, Italy
Giovanni Russo

Corresponding author

Correspondence to Meghana Rathi.

Editor information

Editors and Affiliations

Saga University, Saga, Japan
Kohei Arai
The Science and Information (SAI) Organization, Bradford, West Yorkshire, UK
Supriya Kapoor
The Science and Information (SAI) Organization, Bradford, West Yorkshire, UK
Rahul Bhatia

About this paper

Cite this paper

Rathi, M., Ferraro, P., Russo, G. (2021). Driving Reinforcement Learning with Models. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Systems and Applications. IntelliSys 2020. Advances in Intelligent Systems and Computing, vol 1250. Springer, Cham. https://doi.org/10.1007/978-3-030-55180-3_6

DOI: https://doi.org/10.1007/978-3-030-55180-3_6
Published: 25 August 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-55179-7
Online ISBN: 978-3-030-55180-3
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
