Comparative performance of machine learning-selected portfolios from dynamic CSI300 constituents: forward vs. backward adjusted stock prices

Zhou, Ligang; Chen, Xiaoguo; Tang, Xiaolei

doi:10.1007/s10489-024-06107-4

Published: 17 December 2024

Applied Intelligence, Volume 55, article number 176 (2025)

Abstract

Most existing studies utilize backward-adjusted stock prices from data platforms to develop and backtest investment strategies using machine learning models. However, these prices are not point-in-time data and may introduce look-ahead bias, raising concerns about the reliability of model performance. To examine the impact of different price adjustment methods, we compare the predictive performance of various machine learning models and the backtesting results of portfolios constructed using these models with both forward-adjusted and backward-adjusted stock prices. Our study, conducted from 2012 to 2022, evaluates the real-world viability of investment strategies on the dynamic constituents of the CSI300 index. The empirical results reveal that while certain measures of machine learning models’ predictive performance may not be significantly affected by the stock price adjustment method, the backtesting performance under backward-adjusted stock prices is overestimated compared to that under forward-adjusted stock prices. This research provides evidence for the impact of historical stock price adjustments in developing machine learning models and presents a comprehensive framework for applying these techniques to the management of index constituent portfolios, thereby bridging the gap between predictive modeling and practical investment strategies.
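
The distinction the abstract draws can be illustrated with a minimal sketch. It is not the authors' procedure. It assumes a naming convention consistent with the abstract's claim: "forward-adjusted" prices are anchored at the first day of the sample, so each value uses only corporate actions up to that day, while "backward-adjusted" prices are anchored at the last day, so earlier values are rescaled by actions that occur later. The function name `adjust_prices`, the `event_factor` input, and the toy prices are hypothetical.

```python
from itertools import accumulate
from operator import mul


def adjust_prices(raw_close, event_factor):
    """Return (forward_adjusted, backward_adjusted) price lists.

    raw_close[t]    : unadjusted closing price on day t
    event_factor[t] : price ratio implied by a corporate action taking effect
                      on day t (1.0 if none), e.g. 2.0 for a 2-for-1 split.
    Both inputs are illustrative assumptions, not fields from CSMAR.
    """
    # Cumulative factor built only from events up to and including day t.
    cumulative = list(accumulate(event_factor, mul))

    # Anchored at the first day: the value on day t needs no future data.
    forward = [price * c for price, c in zip(raw_close, cumulative)]

    # Anchored at the last day: every earlier value is divided by the total
    # factor, which is only known at the end of the sample.
    total = cumulative[-1]
    backward = [value / total for value in forward]
    return forward, backward


# Toy series with a 2-for-1 split taking effect on the third day.
raw = [10.0, 10.4, 5.3, 5.5]
factors = [1.0, 1.0, 2.0, 1.0]

full_forward, full_backward = adjust_prices(raw, factors)
# What an investor could have computed after only two days of data.
early_forward, early_backward = adjust_prices(raw[:2], factors[:2])

print("forward, full sample: ", full_forward)
print("forward, first 2 days:", early_forward)
print("backward, full sample:", full_backward)
print("backward, first 2 days:", early_backward)
```

Under these assumptions, the first two forward-adjusted values are the same whether or not the later split is known, whereas the backward-adjusted values for those days change once the split enters the sample. Period-to-period returns are identical under both conventions; only the price levels differ.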

Data availability

The data that support this study were obtained from China Stock Market & Accounting Research Database (CSMAR), available at https://data.csmar.com/. As a commercial and professional financial data provider, CSMAR restricts free sharing of its data due to licensing terms. However, additional details about the specific field names and database tables used in this research can be made available upon request for purposes of reproducing and building upon the analytical work.

Funding

No funding was received for conducting this study.

Author information

Authors and Affiliations

School of Business, Macau University of Science and Technology, Taipa, Macau
Ligang Zhou, Xiaoguo Chen & Xiaolei Tang

Contributions

Ligang Zhou: Conceptualization, Data analysis, Writing - original draft preparation. Xiaoguo Chen: Data collection, Visualization. Xiaolei Tang: Investigation, Writing - review and editing.

Corresponding author

Correspondence to Ligang Zhou.

Ethics declarations

Competing interests

The authors have no competing interests to declare that are relevant to the content of this article.

Conflicts of interest

The authors have no financial interests that could have appeared to influence the work reported in this paper.

Ethics approval

Ethical approval and informed consent were not necessary for the use of these data in this study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhou, L., Chen, X. & Tang, X. Comparative performance of machine learning-selected portfolios from dynamic CSI300 constituents: forward vs. backward adjusted stock prices. Appl Intell 55, 176 (2025). https://doi.org/10.1007/s10489-024-06107-4

Accepted: 22 November 2024
Published: 17 December 2024
DOI: https://doi.org/10.1007/s10489-024-06107-4

Keywords