Abstract
Quantum Computing promises the availability of computational resources and generalization capabilities well beyond the possibilities of classical computers. An interesting approach for leveraging near-term, Noisy Intermediate-Scale Quantum computers is the hybrid training of Parameterized Quantum Circuits (PQCs), i.e. the optimization of a parameterized quantum algorithm, used as a function approximator, with classical optimization techniques. When PQCs are used in Machine Learning models, they may offer advantages over classical models in terms of memory consumption and sample complexity for classical data analysis. In this work we explore and assess the advantages of applying Parameterized Quantum Circuits to one of the state-of-the-art Reinforcement Learning algorithms for continuous control, namely Soft Actor-Critic. We investigate its performance on the control of a virtual robotic arm by means of digital simulations of quantum circuits. A quantum advantage over the classical algorithm has been found in terms of a significant decrease in the number of parameters required for satisfactory model training, paving the way for further developments and studies.
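The hybrid scheme described above, a quantum circuit evaluated by classical simulation and trained with a classical optimizer, can be illustrated with a minimal sketch. This is not the authors' actual circuit or training setup: it is a hypothetical single-qubit example with an RY data-encoding rotation followed by one trainable RY rotation, whose Pauli-Z expectation value serves as the model output, with the gradient obtained via the parameter-shift rule.

```python
import numpy as np

def ry(angle):
    """Matrix of a single-qubit RY rotation gate (real-valued)."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]])

def expectation_z(x, theta):
    """<Z> after applying RY(x) data encoding, then trainable RY(theta), to |0>."""
    state = ry(theta) @ ry(x) @ np.array([1.0, 0.0])
    z = np.array([[1.0, 0.0], [0.0, -1.0]])
    return float(state @ z @ state)

def parameter_shift_grad(x, theta):
    """Gradient of <Z> w.r.t. theta via the parameter-shift rule:
    0.5 * (f(theta + pi/2) - f(theta - pi/2))."""
    shift = np.pi / 2
    return 0.5 * (expectation_z(x, theta + shift)
                  - expectation_z(x, theta - shift))

# Toy classical training loop: gradient descent on theta so that the
# circuit output matches a target expectation value for a fixed input.
x, theta, target, lr = 0.3, 0.0, 0.0, 0.5
for _ in range(200):
    error = expectation_z(x, theta) - target
    theta -= lr * 2.0 * error * parameter_shift_grad(x, theta)
```

For this circuit the output is analytically cos(x + theta), so the parameter-shift estimate can be checked against the exact derivative -sin(x + theta); in full PQC models the same rule provides exact gradients where automatic differentiation of the quantum hardware is not available.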
Acknowledgements
This work was fully funded by NTT DATA Corporation (Japan) and supported by NTT DATA Italia S.p.A. (Italy).
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Policicchio, A., Acuto, A., Barillà, P., Bozzolo, L., Conterno, M. (2025). A Variational Quantum Soft Actor-Critic Algorithm for Continuous Control Tasks. In: Sergeyev, Y.D., Kvasov, D.E., Astorino, A. (eds) Numerical Computations: Theory and Algorithms. NUMTA 2023. Lecture Notes in Computer Science, vol 14478. Springer, Cham. https://doi.org/10.1007/978-3-031-81247-7_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-81246-0
Online ISBN: 978-3-031-81247-7
eBook Packages: Computer Science, Computer Science (R0)