Abstract
In this work, we propose a reinforcement learning-based methodology for the regulation problem of a Van der Pol oscillator with an actuator subject to constraints. We use two neural networks: one that learns an approximation of the cost of a state, and one that learns the controller output. We employ a classic PID controller with compensation as the base policy in a rollout scheme. This policy is further improved by a neural network trained on trajectories from random initial states. The results show that the resulting control policy reduces the cost of a minimum-energy trajectory from a given initial state.
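To make the setting concrete, the sketch below simulates the controlled Van der Pol dynamics under Euler integration and implements a saturated PID base policy with a nonlinearity-cancelling compensation term. This is a minimal illustration only: the damping parameter, gains, saturation limit, and the exact form of the compensation are assumptions, and the rollout scheme and actor-critic training from the paper are not reproduced here.

```python
import numpy as np

MU = 1.0      # Van der Pol damping parameter (illustrative value)
U_MAX = 2.0   # actuator saturation limit (assumed)
DT = 0.01     # Euler integration step (assumed)

def vdp_step(x, u, dt=DT, mu=MU):
    """One Euler step of the controlled Van der Pol oscillator:
    x1' = x2,  x2' = mu * (1 - x1**2) * x2 - x1 + u."""
    x1, x2 = x
    dx1 = x2
    dx2 = mu * (1.0 - x1**2) * x2 - x1 + u
    return np.array([x1 + dt * dx1, x2 + dt * dx2])

class SaturatedPID:
    """PID base policy with a compensation term and actuator clipping.
    The gains and the compensation form are illustrative assumptions,
    not the values used in the paper."""
    def __init__(self, kp=2.0, ki=0.5, kd=1.0, mu=MU):
        self.kp, self.ki, self.kd, self.mu = kp, ki, kd, mu
        self.integral = 0.0

    def __call__(self, x, dt=DT):
        x1, x2 = x                                  # regulate to the origin
        self.integral += -x1 * dt                   # integral of the error
        pid = self.kp * (-x1) + self.ki * self.integral + self.kd * (-x2)
        comp = -self.mu * (1.0 - x1**2) * x2        # cancels nonlinear damping
        return float(np.clip(pid + comp, -U_MAX, U_MAX))  # actuator constraint

if __name__ == "__main__":
    x = np.array([1.0, 0.0])    # an example initial state
    policy = SaturatedPID()
    for _ in range(1000):
        x = vdp_step(x, policy(x))
    print("final state:", x)    # should approach the origin
```

In the rollout scheme described above, a policy of this kind would supply the base trajectories whose accumulated cost the critic network approximates, with the actor network then learning a controller output that improves on it.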
Cite this paper
Solórzano-Espíndola, C.E., Avelar-Barragán, J.Á., Menchaca-Mendez, R. (2020). Regulation of a Van der Pol Oscillator Using Reinforcement Learning. In: Mata-Rivera, M.F., Zagal-Flores, R., Barria-Huidobro, C. (eds) Telematics and Computing. WITCOM 2020. Communications in Computer and Information Science, vol 1280. Springer, Cham. https://doi.org/10.1007/978-3-030-62554-2_21