Abstract
Existing causal discovery algorithms are usually neither effective nor efficient enough on high-dimensional data, because high dimensionality reduces discovery accuracy and increases computational complexity. To alleviate these problems, we present a three-phase approach for learning the structure of nonlinear causal models that combines a feature selection method with two state-of-the-art causal discovery methods. In the first phase, a greedy search method based on Max-Relevance and Min-Redundancy is employed to discover the candidate causal set, from which a rough skeleton of the causal network is generated. In the second phase, a constraint-based method is used to refine the rough skeleton into an accurate skeleton. In the third phase, the direction learning algorithm IGCI is applied to determine the causal directions in the accurate skeleton. The experimental results show that the proposed approach is both effective and scalable, and they yield interesting findings on high-dimensional data in particular.
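To make the third phase more concrete, the following is a minimal Python sketch of how an IGCI-style direction decision for a single variable pair might look. It assumes the slope-based IGCI estimator with a uniform reference measure, as described by Janzing et al. (2012); the abstract does not state which IGCI variant the authors use, and the function names `igci_slope` and `igci_direction` and the toy data are illustrative only, not taken from the paper.

```python
import math


def _rescale(values):
    """Map values linearly onto [0, 1] (uniform reference measure)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("variable is constant; no direction can be inferred")
    return [(v - lo) / (hi - lo) for v in values]


def igci_slope(x, y):
    """Slope-based estimate of C_{X->Y}: the mean of log|dy/dx|
    over neighbouring points after sorting by x."""
    pairs = sorted(zip(_rescale(x), _rescale(y)))
    total, count = 0.0, 0
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx != 0 and dy != 0:  # skip ties, whose log-slope is undefined
            total += math.log(abs(dy / dx))
            count += 1
    if count == 0:
        raise ValueError("not enough distinct points to estimate slopes")
    return total / count


def igci_direction(x, y):
    """Prefer the direction with the smaller slope-based score."""
    c_xy = igci_slope(x, y)
    c_yx = igci_slope(y, x)
    if c_xy < c_yx:
        return "X->Y"
    if c_yx < c_xy:
        return "Y->X"
    return "undecided"


# Toy data (an assumption for illustration): x on a uniform grid, y = x^3.
x = [i / 200 for i in range(1, 201)]
y = [v ** 3 for v in x]
direction = igci_direction(x, y)
print(direction)
```

In the approach summarized above, a decision of this kind would be made only for pairs that remain adjacent in the accurate skeleton from the second phase, so the number of pairwise direction tests stays small even when the number of variables is large.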
Acknowledgments
This work is financially supported by the Natural Science Foundation of China (61100148, 61202269, 61472089), the Foundation for Distinguished Young Talents in Higher Education of Guangdong, China (LYM11060), the Key Technology Research and Development Programs of Guangdong Province (2012B01010029), the Science and Technology Plan Project of Guangzhou City (12C42111607, 201200000031, 2013Y2-00034, 2014Y2-00027), the Specialized Research Fund for the Doctoral Program of Higher Education (20134420110010), the Opening Project of the State Key Laboratory for Novel Software Technology (KFKT2014B03), and the Discipline Construction and Quality Engineering of Higher Education in Guangdong Province (PT2011JSJ).
Cite this article
Hao, Z., Zhang, H., Cai, R. et al. Causal discovery on high dimensional data. Appl Intell 42, 594–607 (2015). https://doi.org/10.1007/s10489-014-0607-0
DOI: https://doi.org/10.1007/s10489-014-0607-0