Causal discovery on high dimensional data

Hao, Zhifeng; Zhang, Hao; Cai, Ruichu; Wen, Wen; Li, Zhihao

doi:10.1007/s10489-014-0607-0

Published: 25 November 2014

Applied Intelligence, Volume 42, pages 594–607 (2015)

Abstract

Existing causal discovery algorithms are usually neither effective nor efficient enough on high dimensional data, because high dimensionality reduces the accuracy of the discovered structure and increases the computational complexity. To alleviate these problems, we present a three-phase approach to learning the structure of nonlinear causal models that takes advantage of a feature selection method and two state-of-the-art causal discovery methods. In the first phase, a greedy search method based on Max-Relevance and Min-Redundancy is employed to discover the candidate causal set, and a rough skeleton of the causal network is generated accordingly. In the second phase, a constraint-based method is used to refine the rough skeleton into an accurate skeleton. In the third phase, the direction learning algorithm IGCI is applied to determine the direction of each causal relation in the accurate skeleton. The experimental results show that the proposed approach is both effective and scalable, with particularly interesting findings on high dimensional data.
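The first and third phases can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (mutual_information, mrmr_select, igci_direction) are hypothetical, the mutual information estimate is a simple plug-in estimate that assumes discrete variables, the Max-Relevance and Min-Redundancy score is the common "relevance minus mean redundancy" form, and the direction test is the slope-based IGCI estimator with min-max scaling (a uniform reference measure). The second phase (constraint-based skeleton refinement) needs conditional independence tests and is omitted.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def mrmr_select(features, target, k):
    """Greedy mRMR: pick up to k feature indices, each time choosing the
    feature with the largest relevance minus mean redundancy."""
    relevance = [mutual_information(f, target) for f in features]
    selected = []
    remaining = list(range(len(features)))
    while remaining and len(selected) < k:
        def score(j):
            if not selected:
                return relevance[j]
            redundancy = sum(
                mutual_information(features[j], features[s]) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy

        best = max(remaining, key=score)  # ties go to the lowest index
        selected.append(best)
        remaining.remove(best)
    return selected


def _slope_score(xs, ys):
    """Slope-based IGCI estimate of C_{X->Y} on min-max scaled data.
    Assumes neither variable is constant."""
    lo_x, hi_x = min(xs), max(xs)
    lo_y, hi_y = min(ys), max(ys)
    pairs = sorted(
        ((x - lo_x) / (hi_x - lo_x), (y - lo_y) / (hi_y - lo_y))
        for x, y in zip(xs, ys)
    )
    terms = []
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx != 0 and dy != 0:  # skip tied neighbours
            terms.append(math.log(abs(dy / dx)))
    return sum(terms) / len(terms)


def igci_direction(xs, ys):
    """Return "X->Y" if the slope score favours X causing Y, else "Y->X"."""
    return "X->Y" if _slope_score(xs, ys) < _slope_score(ys, xs) else "Y->X"


if __name__ == "__main__":
    # Toy data: the target is determined by two independent bits a and b;
    # the second feature is an exact duplicate of the first.
    a = [0, 0, 1, 1] * 5
    b = [0, 1, 0, 1] * 5
    target = [2 * ai + bi for ai, bi in zip(a, b)]
    print(mrmr_select([a, list(a), b], target, k=2))

    # Deterministic nonlinear relation y = x^3 on an evenly spaced grid.
    xs = [(i + 0.5) / 200 for i in range(200)]
    ys = [x ** 3 for x in xs]
    print(igci_direction(xs, ys))
```

On the toy data the duplicated feature is skipped because its redundancy with the already selected copy cancels its relevance. For continuous, high dimensional data a different mutual information estimator would be needed in place of the plug-in count used here.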

Acknowledgments

This work is financially supported by the Natural Science Foundation of China (61100148, 61202269, 61472089), the Foundation for Distinguished Young Talents in Higher Education of Guangdong, China (LYM11060), the Key Technology Research and Development Programs of Guangdong Province (2012B01010029), the Science and Technology Plan Project of Guangzhou City (12C42111607, 201200000031, 2013Y2-00034, 2014Y2-00027), the Specialized Research Fund for the Doctoral Program of Higher Education (20134420110010), the Opening Project of the State Key Laboratory for Novel Software Technology (KFKT2014B03), and the Discipline Construction and Quality Engineering of Higher Education in Guangdong Province (PT2011JSJ).

Author information

Authors and Affiliations

Faculty of Applied Mathematics, Guangdong University of Technology, Guangzhou, China
Zhifeng Hao & Hao Zhang
Faculty of Computer Science, Guangdong University of Technology, Guangzhou, China
Zhifeng Hao, Ruichu Cai, Wen Wen & Zhihao Li

Corresponding author

Correspondence to Hao Zhang.

About this article

Cite this article

Hao, Z., Zhang, H., Cai, R. et al. Causal discovery on high dimensional data. Appl Intell 42, 594–607 (2015). https://doi.org/10.1007/s10489-014-0607-0

Issue Date: April 2015
DOI: https://doi.org/10.1007/s10489-014-0607-0
