Optimal Trade Execution Based on Deep Deterministic Policy Gradient

Ye, Zekun; Deng, Weijie; Zhou, Shuigeng; Xu, Yi; Guan, Jihong

doi:10.1007/978-3-030-59410-7_42

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12112)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

In this paper, we address the Optimal Trade Execution (OTE) problem over the limit order book mechanism, which is about how best to trade a given block of shares at minimal cost or for maximal return. To this end, we propose a solution based on deep reinforcement learning. Though reinforcement learning has been applied to the OTE problem, this paper is the first work that explores deep reinforcement learning and achieves state-of-the-art performance. Concretely, we develop a deep deterministic policy gradient framework that can effectively exploit comprehensive features of multiple periods of the real and volatile market. Experiments on three real market datasets show that the proposed approach significantly outperforms the existing methods, including the Submit & Leave (SL) policy (as baseline), the Q-learning algorithm, and the latest hybrid method that combines the Almgren-Chriss model and reinforcement learning.
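To make the Submit & Leave baseline named in the abstract concrete, here is a minimal toy sketch in Python. It is not the paper's implementation: the function name `submit_and_leave`, the per-step bid prices and sizes, and the fill rule (a resting limit sell order fills at its own price whenever the best bid reaches it, and any leftover is sold at the final bid, ignoring depth and market impact) are all simplifying assumptions made for illustration.

```python
from typing import Sequence


def submit_and_leave(
    bids: Sequence[float],
    bid_sizes: Sequence[int],
    volume: int,
    limit_price: float,
) -> float:
    """Toy Submit & Leave for a SELL order (hypothetical helper).

    One limit order for the whole block is placed at `limit_price` and left
    untouched for the horizon. Assumption: at each step it fills at
    `limit_price`, up to the displayed bid size, whenever the best bid is at
    or above the limit. Whatever remains at the end is sold with a market
    order at the last bid (depth and impact are ignored in this sketch).
    Returns the total cash received.
    """
    remaining = volume
    proceeds = 0.0
    for bid, size in zip(bids, bid_sizes):
        if remaining == 0:
            break
        if bid >= limit_price:
            filled = min(remaining, size)
            proceeds += filled * limit_price
            remaining -= filled
    # Forced liquidation of the unfilled remainder at the end of the horizon.
    proceeds += remaining * bids[-1]
    return proceeds


def shortfall(arrival_price: float, volume: int, proceeds: float) -> float:
    """Cost relative to selling the whole block at the arrival price."""
    return arrival_price * volume - proceeds


if __name__ == "__main__":
    cash = submit_and_leave(
        bids=[10.0, 10.2, 10.1, 9.9],
        bid_sizes=[100, 50, 100, 100],
        volume=200,
        limit_price=10.1,
    )
    print(cash, shortfall(10.0, 200, cash))
```

The only decision in this baseline is the single limit price, fixed at the start. A learned policy such as the one the abstract proposes would instead adjust its orders as market conditions change over the horizon.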

Acknowledgement

This work was supported in part by Science and Technology Commission of Shanghai Municipality Project (#19511120700). Jihong Guan was partially supported by the Program of Science and Technology Innovation Action of Science and Technology Commission of Shanghai Municipality under Grant No. 17511105204 and the Special Fund for Shanghai Industrial Transformation and Upgrading under grant No. 18XI-05, Shanghai Municipal Commission of Economy and Informatization.

Author information

Authors and Affiliations

Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai, 200433, China
Zekun Ye, Weijie Deng, Shuigeng Zhou & Yi Xu
Department of Computer Science and Technology, Tongji University, 4800 Caoan Road, Shanghai, 201804, China
Jihong Guan

Corresponding author

Correspondence to Shuigeng Zhou.

Editor information

Editors and Affiliations

Dankook University, Yongin, Korea (Republic of)
Yunmook Nah
Peking University, Haidian, China
Bin Cui
Sungkyunkwan University, Suwon, Korea (Republic of)
Sang-Won Lee
Department of System Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong, Hong Kong
Jeffrey Xu Yu
Kangwon National University, Chunchon, Korea (Republic of)
Yang-Sae Moon
Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Steven Euijong Whang

About this paper

Cite this paper

Ye, Z., Deng, W., Zhou, S., Xu, Y., Guan, J. (2020). Optimal Trade Execution Based on Deep Deterministic Policy Gradient. In: Nah, Y., Cui, B., Lee, SW., Yu, J.X., Moon, YS., Whang, S.E. (eds) Database Systems for Advanced Applications. DASFAA 2020. Lecture Notes in Computer Science, vol 12112. Springer, Cham. https://doi.org/10.1007/978-3-030-59410-7_42

DOI: https://doi.org/10.1007/978-3-030-59410-7_42
Published: 18 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59409-1
Online ISBN: 978-3-030-59410-7
eBook Packages: Mathematics and Statistics; Mathematics and Statistics (R0)
