Abstract
Trading robots are programs designed to execute trades automatically. Stock trading, however, presents a unique challenge: unlike finite game tasks, stock markets operate perpetually, which makes it difficult for traders to design appropriate reward functions for training Reinforcement Learning models. For stock trading tasks in which the optimal decision trajectory can easily be determined from historical data, previous studies showed that Behavioral Cloning has much better learning efficiency than Reinforcement Learning. In this study, we propose a novel Behavioral Cloning algorithm that uses Long Short-Term Memory (LSTM) networks and a self-attention mechanism as core components. Our approach captures temporal dependencies and interrelations among elements at various positions, aiming to enhance learning efficiency. Additionally, we devise an expert strategy, the positive transaction expert strategy, to guide the model training process. In our comparative analysis, we evaluated the proposed algorithm against supervised learning, reinforcement learning, and traditional time series trading algorithms. The empirical results indicate that the Attention-Based Behavioral Cloning algorithm exhibits an 83.33% likelihood of achieving the highest return.
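To illustrate how an optimal decision trajectory might be read off historical prices and turned into Behavioral Cloning training data, the following is a minimal sketch under stated assumptions. The labeling rule, the function names `hindsight_expert_actions` and `make_cloning_dataset`, and the three-action encoding are hypothetical and are not taken from the paper; the abstract does not define the positive transaction expert strategy, so this only shows one simple hindsight rule under which every completed trade has a positive profit.

```python
from typing import List, Tuple

# Hypothetical action encoding (an assumption, not the paper's).
HOLD, BUY, SELL = 0, 1, 2


def hindsight_expert_actions(prices: List[float]) -> List[int]:
    """Label each step with a hindsight action.

    Assumed rule: buy when flat and the next price is higher; sell when
    holding and the next price is lower; otherwise hold. Any position
    still open at the end is closed on the last step.
    """
    actions = [HOLD] * len(prices)
    holding = False
    for t in range(len(prices) - 1):
        if not holding and prices[t + 1] > prices[t]:
            actions[t] = BUY
            holding = True
        elif holding and prices[t + 1] < prices[t]:
            actions[t] = SELL
            holding = False
    if holding:
        actions[-1] = SELL  # close the open position on the final step
    return actions


def make_cloning_dataset(
    prices: List[float], window: int
) -> List[Tuple[List[float], int]]:
    """Pair each trailing price window with the expert action at its last step."""
    actions = hindsight_expert_actions(prices)
    samples = []
    for t in range(window - 1, len(prices)):
        samples.append((prices[t - window + 1 : t + 1], actions[t]))
    return samples


if __name__ == "__main__":
    demo_prices = [10, 9, 11, 12, 8, 8, 13]
    print(hindsight_expert_actions(demo_prices))
    print(make_cloning_dataset(demo_prices, window=3)[0])
```

A sequence model such as the LSTM and self-attention network described above would then be trained as a classifier on these (window, action) pairs. The real expert strategy and input features may differ substantially from this sketch.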
Graphical abstract
The learning structure of Attention-Based Behavioral Cloning
Availability of data and materials
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
Funding
This research was supported by the National Key Research and Development Program of China (2023YFC2813000, 2023YFC2813002), the Science and Technology Plan Project of Guangzhou Nansha (project no. 2023ZD016), and the Research Services and Knowledge Transfer Office of the University of Macau (file nos. MYRG2022-00162-FST and MYRG2019-00136-FST).
Author information
Contributions
Qizhou Sun conceived and designed the analysis. Qizhou Sun and Yufan Xie collected the data and performed the analysis. Qizhou Sun, Yufan Xie and Yain-Whar Si wrote the paper.
Ethics declarations
Competing interests
The authors have no competing interests as defined by Springer, or other interests that might be perceived to influence the results and/or discussion reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sun, Q., Xie, Y. & Si, YW. Attention-Based Behavioral Cloning for algorithmic trading. Appl Intell 55, 74 (2025). https://doi.org/10.1007/s10489-024-06064-y
DOI: https://doi.org/10.1007/s10489-024-06064-y