Distilling Financial Models by Symbolic Regression

  • Conference paper
Machine Learning, Optimization, and Data Science (LOD 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 13164)

Abstract

Symbolic Regression has been widely used over the last decades to infer complex models. Its success rests on its ability to recognize correlations in data and to define non-trivial, interpretable models. In this paper, we apply Symbolic Regression to explore its possible uses and obstacles in describing stochastic financial processes. Symbolic Regression (SR) with Genetic Programming (GP) is used to extract financial formulas, inspired by the theory of financial stochastic processes and Itô's Lemma. For this purpose, we introduce two operators into the model: the derivative and the integral. The experiments are conducted on five market indices that reliably describe the evolution of the processes over time: Tokyo Stock Price Index (TOPIX), Standard & Poor's 500 Index (SPX), Dow Jones (DJI), FTSE 100 (FTSE) and Nasdaq Composite (NAS). To avoid both trivial and uninterpretable results, we perform an error-complexity optimization. We run computational experiments to obtain and investigate simple and accurate financial models. The Pareto Front is used to select among multiple candidates, removing the over-specified ones. We also test Eureqa as a benchmark to extract invariant equations. The results highlight the limitations of SR and GP techniques in the study of financial processes, as well as some paths worth pursuing.
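The error-complexity selection mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Candidate` type, the `dominates` and `pareto_front` helpers, the choice of node count as the complexity measure, and all numbers below are assumptions made up for illustration. A candidate is kept only if no other candidate is at least as good on both error and complexity and strictly better on one of them.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical container for one symbolic-regression candidate."""
    expression: str   # human-readable formula (illustrative only)
    error: float      # e.g. a fitting error; the measure is an assumption
    complexity: int   # e.g. node count of the expression tree (assumption)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is no worse than `b` on both objectives and better on one."""
    no_worse = a.error <= b.error and a.complexity <= b.complexity
    strictly_better = a.error < b.error or a.complexity < b.complexity
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the non-dominated candidates, ordered by increasing complexity."""
    front = [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]
    return sorted(front, key=lambda c: c.complexity)


# Made-up candidates; these are NOT results from the paper.
candidates = [
    Candidate("c0", 0.90, 1),
    Candidate("c0 + c1*x", 0.40, 5),
    Candidate("c0 + c1*x + c2*x*x", 0.41, 9),   # larger and less accurate
    Candidate("c0*exp(c1*x)", 0.25, 6),
    Candidate("c0*exp(c1*x) + c2*sin(c3*x)", 0.24, 14),
]

front = pareto_front(candidates)
for c in front:
    print(f"{c.complexity:>3}  {c.error:.2f}  {c.expression}")
```

In this toy set the quadratic candidate is dropped because the linear one is both simpler and more accurate. The remaining candidates trade accuracy against size, and choosing among them (for example, discarding the largest one for a marginal error gain) is a separate judgment that the sketch does not attempt to automate.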

Author information

Corresponding authors

Correspondence to Gabriele La Malfa, Emanuele La Malfa, Roman Belavkin, Panos M. Pardalos or Giuseppe Nicosia.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 656 KB)

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

La Malfa, G., La Malfa, E., Belavkin, R., Pardalos, P.M., Nicosia, G. (2022). Distilling Financial Models by Symbolic Regression. In: Nicosia, G., et al. Machine Learning, Optimization, and Data Science. LOD 2021. Lecture Notes in Computer Science, vol 13164. Springer, Cham. https://doi.org/10.1007/978-3-030-95470-3_38

  • DOI: https://doi.org/10.1007/978-3-030-95470-3_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-95469-7

  • Online ISBN: 978-3-030-95470-3

  • eBook Packages: Computer Science, Computer Science (R0)
