Abstract
Markov decision processes have applications in engineering, economics, operations research, and artificial intelligence. Quantum computers provide a new way to tackle the computational problems that arise in solving Markov decision processes. We develop quantum circuits for a dynamic programming algorithm that solves Markov decision processes. A heuristic circuit construction method based on linear combinations of unitaries is demonstrated. The matrix decomposition and sampling are discussed. The computability advantage over the classical Birkhoff-von Neumann method is proved. The connection to the traditional dynamic programming algorithm is discussed in terms of functional equations. Our work suggests a new hybrid quantum-classical approach to dynamic programming algorithms.
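For orientation, the traditional dynamic programming algorithm mentioned above can be sketched classically as value iteration, which repeatedly applies the Bellman functional equation until the value function stops changing. The sketch below is not the quantum circuit construction of the chapter. The two-state, two-action process, its transition probabilities `P`, rewards `R`, and discount factor `GAMMA` are hypothetical values chosen only for illustration.

```python
# Classical value iteration on a toy Markov decision process.
# All numbers below are hypothetical and serve only as an illustration.

# P[a][s][t]: probability of moving from state s to state t under action a.
P = [
    [[0.9, 0.1], [0.2, 0.8]],  # action 0
    [[0.5, 0.5], [0.6, 0.4]],  # action 1
]
# R[a][s]: expected immediate reward for taking action a in state s.
R = [
    [1.0, 0.0],  # action 0
    [0.0, 2.0],  # action 1
]
GAMMA = 0.9  # discount factor (assumed)


def bellman_backup(values):
    """Apply the Bellman optimality operator once.

    Returns the updated state values and the greedy action per state.
    """
    n_states = len(values)
    new_values, policy = [], []
    for s in range(n_states):
        # Action values: immediate reward plus discounted expected future value.
        q = [
            R[a][s] + GAMMA * sum(P[a][s][t] * values[t] for t in range(n_states))
            for a in range(len(P))
        ]
        best = max(range(len(q)), key=q.__getitem__)
        new_values.append(q[best])
        policy.append(best)
    return new_values, policy


def value_iteration(tol=1e-10, max_iter=10_000):
    """Iterate the Bellman backup until successive values differ by less than tol."""
    values = [0.0] * len(P[0])
    policy = [0] * len(values)
    for _ in range(max_iter):
        new_values, policy = bellman_backup(values)
        if max(abs(x - y) for x, y in zip(new_values, values)) < tol:
            return new_values, policy
        values = new_values
    return values, policy


if __name__ == "__main__":
    optimal_values, greedy_policy = value_iteration()
    print(optimal_values, greedy_policy)
```

The returned values are an approximate fixed point of the Bellman operator, which is the functional-equation view that the abstract refers to.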
Acknowledgements
We thank Naoki Yamamoto for valuable discussions.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, CC., Shiba, K., Sogabe, M., Sakamoto, K., Sogabe, T. (2021). Hybrid Quantum-Classical Dynamic Programming Algorithm. In: Yada, K., et al. Advances in Artificial Intelligence. JSAI 2020. Advances in Intelligent Systems and Computing, vol 1357. Springer, Cham. https://doi.org/10.1007/978-3-030-73113-7_18
DOI: https://doi.org/10.1007/978-3-030-73113-7_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-73112-0
Online ISBN: 978-3-030-73113-7
eBook Packages: Intelligent Technologies and Robotics (R0)