Attention-Based Behavioral Cloning for algorithmic trading

Applied Intelligence

Abstract

Trading robots are programs designed to execute trades automatically. Stock trading, however, presents a distinctive challenge: unlike finite game tasks, stock markets operate perpetually, which makes it difficult for traders to design appropriate reward functions for training Reinforcement Learning models. For stock trading tasks in which the optimal decision trajectory can easily be determined from historical data, previous studies have shown that Behavioral Cloning learns much more efficiently than Reinforcement Learning. In this study, we propose a novel Behavioral Cloning algorithm that uses Long Short-Term Memory (LSTM) networks and a self-attention mechanism as core components. Our approach captures temporal dependencies and the interrelations among elements at different positions, with the aim of improving learning efficiency. In addition, we devise an expert strategy, the positive transaction expert strategy, to guide model training. We compare the proposed algorithm against supervised learning, reinforcement learning, and traditional time series trading algorithms. The empirical results indicate that the Attention-Based Behavioral Cloning algorithm has an 83.33% likelihood of achieving the highest return.
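To make the two ideas in the abstract concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It shows (a) how expert demonstrations for Behavioral Cloning might be derived from historical prices in hindsight, and (b) scaled dot-product self-attention over a short sequence. The labelling rule in `hindsight_actions`, the `cost` threshold, the `window` length, and all function names are hypothetical; the abstract does not specify the positive transaction expert strategy, and the LSTM and learned projection weights of the proposed model are omitted here.

```python
import math


def hindsight_actions(prices, cost=0.0):
    """Label each step with the action a hindsight oracle would take.

    1 = buy, -1 = sell, 0 = hold. This rule is an illustrative assumption,
    not the paper's positive transaction expert strategy.
    """
    actions = []
    for today, tomorrow in zip(prices, prices[1:]):
        change = tomorrow - today
        if change > cost:
            actions.append(1)
        elif change < -cost:
            actions.append(-1)
        else:
            actions.append(0)
    return actions


def make_demonstrations(prices, window=3, cost=0.0):
    """Pair each window of past prices with the expert action at its last step.

    The resulting (state, action) pairs turn Behavioral Cloning into an
    ordinary supervised learning problem.
    """
    actions = hindsight_actions(prices, cost)
    demos = []
    for end in range(window - 1, len(actions)):
        state = prices[end - window + 1 : end + 1]
        demos.append((state, actions[end]))
    return demos


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Scaled dot-product self-attention with identity projections.

    Queries, keys, and values are all the input vectors themselves; a trained
    model would apply learned projections first (omitted in this sketch).
    """
    dim = len(seq[0])
    output = []
    for query in seq:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in seq
        ]
        weights = softmax(scores)
        output.append(
            [sum(w * value[j] for w, value in zip(weights, seq)) for j in range(dim)]
        )
    return output


if __name__ == "__main__":
    prices = [10.0, 11.0, 10.5, 10.5, 12.0]
    for state, action in make_demonstrations(prices, window=3):
        print(state, "->", action)
```

Each output row of `self_attention` is a weighted average of all positions in the sequence, which is how a model of this kind can relate elements at different positions; in the proposed algorithm such a layer is combined with an LSTM, whereas this sketch treats the two pieces separately.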

Graphical abstract

The learning structure of Attention-Based Behavioral Cloning

Availability of data and materials

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Funding

This research was supported by the National Key Research and Development Program of China (2023YFC2813000, 2023YFC2813002), the Science and Technology Plan Project of Guangzhou Nansha (project no. 2023ZD016), and the Research Services and Knowledge Transfer Office of the University of Macau (file nos. MYRG2022-00162-FST and MYRG2019-00136-FST).

Author information

Contributions

Qizhou Sun conceived and designed the analysis. Qizhou Sun and Yufan Xie collected the data and performed the analysis. Qizhou Sun, Yufan Xie and Yain-Whar Si wrote the paper.

Corresponding author

Correspondence to Yain-Whar Si.

Ethics declarations

Competing interests

The authors have no competing interests as defined by Springer, or other interests that might be perceived to influence the results and/or discussion reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sun, Q., Xie, Y. & Si, YW. Attention-Based Behavioral Cloning for algorithmic trading. Appl Intell 55, 74 (2025). https://doi.org/10.1007/s10489-024-06064-y

  • DOI: https://doi.org/10.1007/s10489-024-06064-y
