A Study on Relevant Features for Intraday S&P 500 Prediction Using a Hybrid Feature Selection Approach

  • Conference paper
Machine Learning, Optimization, and Data Science (LOD 2021)

Abstract

This paper investigates relevant features for the prediction of intraday S&P 500 returns. In contrast to most previous research, the problem is approached as a four-class classification problem to account for the magnitude of the returns and not only the direction of price movements. A novel framework for feature selection using a hybrid approach is developed that combines correlation as a fast filter method with the wrapper method differential evolution feature selection (DEFS), which deploys distance-based classifiers (k-nearest neighbor, fuzzy k-nearest neighbor, and multi-local power mean fuzzy k-nearest neighbor) as the evaluation criterion. The experimental results show that feature selection successfully discarded features for this application, improving the test set accuracies or, at a minimum, leading to accuracies similar to those obtained with the entire feature set. Moreover, all setups in this study ranked technical indicators such as the 5-day simple moving average as the most relevant features in this application. In contrast, the features based on other stock indices, commodities, and simple price and volume information were a minority within the top 10 and top 50 features. The prediction accuracies for the positive return class were considerably higher than those for the negative return class, with over \(70\%\) accuracy compared to \(30\%\).
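The filter-then-wrapper idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the correlation filter here ranks features by absolute Pearson correlation with an integer-coded class label (an assumption), the wrapper is a plain DE/rand/1/bin search over thresholded real vectors rather than DEFS itself, plain k-nearest neighbor stands in for the three classifiers, and the data are synthetic. All function names and parameter values are hypothetical.

```python
import numpy as np

def correlation_filter(X, y, keep):
    """Filter step (assumed form): keep the `keep` features with the
    largest absolute Pearson correlation with the class label."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    corr = np.abs(Xc.T @ yc) / np.where(denom == 0, 1.0, denom)
    return np.argsort(corr)[::-1][:keep]

def knn_accuracy(X_tr, y_tr, X_va, y_va, k=5):
    """Validation accuracy of a majority-vote k-nearest neighbor classifier."""
    d = ((X_va[:, None, :] - X_tr[None, :, :]) ** 2).sum(axis=2)
    nn = np.argsort(d, axis=1)[:, :k]
    pred = np.array([np.bincount(v).argmax() for v in y_tr[nn]])
    return (pred == y_va).mean()

def de_select(X_tr, y_tr, X_va, y_va, pop=12, gens=15, F=0.5, CR=0.9, seed=0):
    """Wrapper step: simplified differential evolution (rand/1/bin).
    A feature counts as selected when its vector entry exceeds 0.5."""
    rng = np.random.default_rng(seed)
    d = X_tr.shape[1]

    def fitness(v):
        mask = v > 0.5
        if not mask.any():          # an empty subset cannot classify
            return 0.0
        return knn_accuracy(X_tr[:, mask], y_tr, X_va[:, mask], y_va)

    P = rng.random((pop, d))
    fit = np.array([fitness(v) for v in P])
    for _ in range(gens):
        for i in range(pop):
            others = [j for j in range(pop) if j != i]
            a, b, c = rng.choice(others, 3, replace=False)
            mutant = np.clip(P[a] + F * (P[b] - P[c]), 0.0, 1.0)
            cross = rng.random(d) < CR
            cross[rng.integers(d)] = True   # at least one entry from the mutant
            trial = np.where(cross, mutant, P[i])
            f = fitness(trial)
            if f >= fit[i]:                 # greedy one-to-one replacement
                P[i], fit[i] = trial, f
    best = fit.argmax()
    return P[best] > 0.5, fit[best]

# Synthetic four-class data: features 0-2 carry the label, 3-19 are noise.
rng = np.random.default_rng(1)
n = 200
y = rng.integers(0, 4, n)
X = np.hstack([y[:, None] + 0.3 * rng.standard_normal((n, 3)),
               rng.standard_normal((n, 17))])

kept = correlation_filter(X[:120], y[:120], keep=8)
mask, acc = de_select(X[:120][:, kept], y[:120], X[120:][:, kept], y[120:])
selected = kept[mask]               # indices into the original feature set
print("selected features:", sorted(selected.tolist()), "accuracy:", acc)
```

The filter cheaply shrinks the candidate set before the wrapper spends classifier evaluations on subsets, which is the motivation for a hybrid design. In a real study the accuracy used as fitness would come from a validation split kept separate from the final test set.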

Notes

  1. The MATLAB code of the updated MLPM-FKNN algorithm can be found at https://github.com/MahindaMK/Multi-local-Power-means-based-fuzzy-k-nearest-neighbor-algorithm-MLPM-FKNN.

Acknowledgment

This research was supported by the Finnish Foundation for Share Promotion (Pörssisäätiö).

Author information

Corresponding author

Correspondence to Mahinda Mailagaha Kumbure.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Mailagaha Kumbure, M., Lohrmann, C., Luukka, P. (2022). A Study on Relevant Features for Intraday S&P 500 Prediction Using a Hybrid Feature Selection Approach. In: Nicosia, G., et al. Machine Learning, Optimization, and Data Science. LOD 2021. Lecture Notes in Computer Science, vol 13163. Springer, Cham. https://doi.org/10.1007/978-3-030-95467-3_7

  • DOI: https://doi.org/10.1007/978-3-030-95467-3_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-95466-6

  • Online ISBN: 978-3-030-95467-3

  • eBook Packages: Computer Science, Computer Science (R0)
