Abstract
Causal discovery from observational data is crucial to a wide range of scientific and business research. Although many causal discovery algorithms have been proposed in recent decades, none of them deals effectively with high-dimensional discrete data. The main challenge is the complex interaction among a large number of variables, which leads to numerous spurious causal relations. In this work, we propose a novel multiple-cause discovery method combined with structure learning (McDSL) to eliminate these spurious causalities. The method proceeds in two phases. In the first phase, conditional independence tests are used to distinguish direct causal candidates from indirect ones. In the second phase, the causal direction of each multi-cause structure is determined with a hybrid causal discovery method. Validation experiments on synthetic data showed that McDSL reliably discovers multi-cause structures and eliminates indirect causes. We then applied the algorithm to discover multiple causes of stock return from 13 years of historical financial data of the Shanghai Stock Exchange of China, and built a stock prediction model. Experimental results showed that the causes discovered by McDSL reveal changes in the key risk factors of the stock market over the 13 years, which indicates that investors should adapt their investment strategies over time. Moreover, the causes discovered by McDSL predict stock return better than the features selected by other common filter-based feature selection algorithms.
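The first phase can be illustrated with a small sketch. The abstract does not say which conditional independence test McDSL uses, so the code below is an assumption for illustration only: it uses a plug-in estimate of conditional mutual information with a hypothetical fixed threshold as a stand-in for a proper test. The function names, the threshold value, and the synthetic chain A -> B -> R are all illustrative and are not taken from the paper.

```python
import math
import random
from collections import Counter


def conditional_mutual_information(xs, ys, zs):
    """Plug-in estimate of I(X; Y | Z) in nats from discrete samples."""
    n = len(xs)
    n_xyz = Counter(zip(xs, ys, zs))
    n_xz = Counter(zip(xs, zs))
    n_yz = Counter(zip(ys, zs))
    n_z = Counter(zs)
    cmi = 0.0
    for (x, y, z), c in n_xyz.items():
        # p(x,y,z) * log( p(x,y|z) / (p(x|z) * p(y|z)) ), written with counts
        cmi += (c / n) * math.log(c * n_z[z] / (n_xz[(x, z)] * n_yz[(y, z)]))
    return cmi


def is_direct_candidate(xs, ys, zs, threshold=0.01):
    """Keep X as a direct candidate for Y if dependence survives conditioning on Z.

    The threshold is a hypothetical choice, not a value from the paper.
    """
    return conditional_mutual_information(xs, ys, zs) > threshold


# Synthetic chain A -> B -> R: A affects R only through B.
rng = random.Random(0)


def noisy_copy(value, flip_probability):
    """Return the binary value, flipped with the given probability."""
    return 1 - value if rng.random() < flip_probability else value


a = [rng.randint(0, 1) for _ in range(5000)]
b = [noisy_copy(v, 0.1) for v in a]
r = [noisy_copy(v, 0.1) for v in b]

a_is_direct = is_direct_candidate(a, r, b)  # A given B: indirect cause
b_is_direct = is_direct_candidate(b, r, a)  # B given A: direct cause
print("A kept as direct candidate:", a_is_direct)
print("B kept as direct candidate:", b_is_direct)
```

In this toy chain, conditioning on B removes nearly all dependence between A and R, so A is treated as an indirect cause, while B remains dependent on R after conditioning on A. A real implementation would replace the fixed threshold with a significance test and would search over conditioning sets rather than a single variable.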
Notes
If \(|S|=|S'|=1\), the above definition reduces to the definition in Peters et al. (2011).
‘\(\#\) Factor’ indicates that Factor \(\#\) is inferred by McDSL as a cause of return in the training set.
‘NoFS’ indicates no feature selection. Best results are highlighted in bold. The value in parentheses is the performance difference from our corresponding algorithm. ‘Average’ is the average over 6 algorithms on 7 baseline models.
References
Agbabiaka TB, Savović J, Ernst E (2008) Methods for causality assessment of adverse drug reactions. Drug Saf 31(1):21–37
Aliferis CF, Statnikov A, Tsamardinos I, Mani S, Koutsoukos XD (2010) Local causal and Markov blanket induction for causal discovery and feature selection for classification part I: algorithms and empirical evaluation. J Mach Learn Res 11:171–234
Andreu L, Aldás J, Bigné JE, Mattila AS (2010) An analysis of e-business adoption and its impact on relational quality in travel agency-supplier relationships. Tour Manag 31(6):777–787
Breiman L, Friedman J, Stone CJ, Olshen RA (1984) Classification and regression trees. CRC Press, Boca Raton
Cai R, Zhang Z, Hao Z (2011) Bassum: a Bayesian semi-supervised method for classification feature selection. Pattern Recognit 44(4):811–820
Cai R, Zhang Z, Hao Z (2013a) Causal gene identification using combinatorial v-structure search. Neural Netw 43:63–71
Cai R, Zhang Z, Hao Z (2013b) Sada: a general framework to support robust causation discovery. In: Proceedings of the 30th international conference on machine learning, pp 208–216
Chang YC, Hsieh YL, Chen CC, Hsu WL (2015) A semantic frame-based intelligent agent for topic detection. Soft Comput. doi:10.1007/s00500-015-1695-4
De Morais SR, Aussem A (2010) A novel Markov boundary based feature subset selection algorithm. Neurocomputing 73(4):578–584
Esposito C, Ficco M, Palmieri F, Castiglione A (2015) Smart cloud storage service selection based on fuzzy logic, theory of evidence and game theory. IEEE Trans Comput. doi:10.1109/TC.2015.2389952
Fama EF, French KR (1992) The cross-section of expected stock returns. J Financ 47(2):427–465
Fernandez-Lozano C, Seoane JA, Gestal M, Gaunt TR, Dorado J, Campbell C (2015) Texture classification using feature selection and kernel-based techniques. Soft Comput. doi:10.1007/s00500-014-1573-5
Fu R, Qin B, Liu T (2015) Open-categorical text classification based on multi-LDA models. Soft Comput 19(1):29–38
Hoyer PO, Janzing D, Mooij JM, Peters J, Schölkopf B (2009) Nonlinear causal discovery with additive noise models. In: Advances in neural information processing systems, pp 689–696
Kano Y, Shimizu S (2003) Causal inference using nonnormality. In: Proceedings of the international symposium on science of modeling, the 30th anniversary of the information criterion, pp 261–270
Karahoca A, Tunga MA (2015) A polynomial based algorithm for detection of embolism. Soft Comput 19(1):167–177
Koller D, Sahami M (1996) Toward optimal feature selection. Proc int conf mach Learn 20(1113):284–292
Lee M-C (2009) Using support vector machine with a hybrid feature selection method to the stock trend prediction. Expert Syst Appl 36(8):10896–10904
Mooij J, Janzing D, Peters J, Schölkopf B (2009) Regression by dependence minimization and its application to causal inference in additive noise models. In: Proceedings of the 26th annual international conference on machine learning, pp 745–752. ACM
Pearl J (2000) Causality: models, reasoning and inference, vol 29. Cambridge Univ Press, Cambridge
Peters J, Janzing D, Gretton A, Schölkopf B (2009) Detecting the direction of causal time series. In: Proceedings of the 26th annual international conference on machine learning, pp 801–808. ACM
Peters J, Janzing D, Schölkopf B (2010) Identifying cause and effect on discrete data using additive noise models. In: International conference on artificial intelligence and statistics, pp 597–604
Peters J, Janzing D, Schölkopf B (2011) Causal inference on discrete data using additive noise models. IEEE Trans Pattern Anal Mach Intell 33(12):2436–2450
Sethi R (1996) Endogenous regime switching in speculative markets. Struct Change Econ Dyn 7(1):99–118
Shimizu S, Hoyer PO, Hyvärinen A, Kerminen A (2006) A linear non-Gaussian acyclic model for causal discovery. J Mach Learn Res 7:2003–2030
Sobel ME (1996) An introduction to causal inference. Sociol Methods Res 24(3):353–379
Spirtes P, Glymour CN, Scheines R (2000) Causation, prediction, and search, vol 81. MIT Press, Cambridge
Tibshirani R (1994) Regression shrinkage and selection via the lasso. J Royal Stat Soc 58(1):267–288
Tsai C-F, Hsiao Y-C (2010) Combining multiple feature selection methods for stock prediction: union, intersection, and multi-intersection approaches. Decis Support Syst 50(1):258–269
Tsai C-F, Lin Y-C, Yen DC, Chen Y-M (2011) Predicting stock returns by classifier ensembles. Appl Soft Comput 11(2):2452–2459
Tsamardinos I, Aliferis CF, Statnikov A (2003) Time and sample efficient discovery of Markov blankets and direct causal relations. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining, pp 673–678. ACM
Zhang J, Spirtes P (2008) Detection of unfaithfulness and robust causal inference. Minds Mach 18(2):239–271
Zhang X, Yong H, Xie K, Wang S, Ngai EWT, Liu M (2014) A causal feature selection algorithm for stock prediction modeling. Neurocomputing 142:48–59
Zhu Z, Ong Y-S, Dash M (2007) Markov blanket-embedded genetic algorithm for gene selection. Pattern Recognit 40(11):3236–3248
Zunino L, Zanin M, Tabak BM, Pérez DG, Rosso OA (2010) Complexity-entropy causality plane: a useful approach to quantify the stock market inefficiency. Phys A Stat Mech Appl 389(9):1891–1901
Zuo Y, Kita E (2012) Stock price forecast using Bayesian network. Expert Syst Appl 39(8):6729–6737
Acknowledgments
This research was partly supported by the National Natural Science Foundation of China (71271061, 70801020), Science and Technology Planning Project of Guangdong Province, China (2010B010600034, 2012B091100192), Guangdong Natural Science Foundation Research Team (S2013030015737), and Business Intelligence Key Team of Guangdong University of Foreign Studies (TD1202).
Additional information
Communicated by V. Loia.
Cite this article
Chen, W., Hao, Z., Cai, R. et al. Multiple-cause discovery combined with structure learning for high-dimensional discrete data and application to stock prediction. Soft Comput 20, 4575–4588 (2016). https://doi.org/10.1007/s00500-015-1764-8
DOI: https://doi.org/10.1007/s00500-015-1764-8