Replacing Causal Faithfulness with Algorithmic Independence of Conditionals

Lemeire, Jan; Janzing, Dominik

doi:10.1007/s11023-012-9283-1

Published: 22 July 2012

Minds and Machines, volume 23, pages 227–249 (2013)

Abstract

Independence of Conditionals (IC) has recently been proposed as a basic rule for causal structure learning. If a Bayesian network represents the causal structure, its Conditional Probability Distributions (CPDs) should be algorithmically independent. In this paper we compare IC with causal faithfulness (FF), which states that only those conditional independences that are implied by the causal Markov condition hold true. FF is a basic postulate in common approaches to causal structure learning. The common spirit of FF and IC is to reject causal graphs for which the joint distribution looks ‘non-generic’. The difference lies in the notion of genericity: FF sometimes rejects models just because one of the CPDs is simple, for instance if the CPD describes a deterministic relation. IC does not behave in this undesirable way. It only rejects a model when there is a non-generic relation between different CPDs although each CPD looks generic when considered separately. Moreover, it detects relations between CPDs that cannot be captured by conditional independences. IC therefore helps in distinguishing causal graphs that induce the same conditional independences (i.e., they belong to the same Markov equivalence class). The usual justification for FF implicitly assumes a prior that is a probability density on the parameter space. IC can be justified by Solomonoff’s universal prior, assigning non-zero probability to those points in parameter space that have a finite description. In this way, it favours simple CPDs, and therefore respects Occam’s razor. Since Kolmogorov complexity is uncomputable, IC is not directly applicable in practice. We argue that it is nevertheless helpful, since it has already served as inspiration and justification for novel causal inference algorithms.
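The remark that FF can reject a model merely because one CPD is deterministic can be illustrated with a small numerical sketch. The example below is not taken from the article: the chain X -> Y -> Z, the function f, the probability tables, and all helper names (joint_chain, marginal, cond_mutual_info) are assumptions chosen for illustration. Because Y is a deterministic function of X, Y and Z become independent given X, even though they are adjacent in the graph, so this independence is not implied by the causal Markov condition and FF is violated although no CPD is tuned against another.

```python
import math

def joint_chain(p_x, f, p_z_given_y):
    """Exact joint P(x, y, z) for a hypothetical chain X -> Y -> Z with Y = f(X)."""
    joint = {}
    for x, px in p_x.items():
        y = f(x)  # deterministic CPD P(Y | X)
        for z, pz in p_z_given_y[y].items():
            joint[(x, y, z)] = joint.get((x, y, z), 0.0) + px * pz
    return joint

def marginal(joint, idx):
    """Marginal distribution over the variable positions listed in idx."""
    out = {}
    for key, p in joint.items():
        sub = tuple(key[i] for i in idx)
        out[sub] = out.get(sub, 0.0) + p
    return out

def cond_mutual_info(joint, a, b, c=()):
    """Conditional mutual information I(A; B | C) in bits.

    a, b, c are tuples of variable positions; c may be empty.
    """
    p_abc = marginal(joint, a + b + c)
    p_ac = marginal(joint, a + c)
    p_bc = marginal(joint, b + c)
    p_c = marginal(joint, c)
    total = 0.0
    for key, p in p_abc.items():
        if p == 0.0:
            continue
        ka = key[:len(a)]
        kb = key[len(a):len(a) + len(b)]
        kc = key[len(a) + len(b):]
        total += p * math.log2(p * p_c[kc] / (p_ac[ka + kc] * p_bc[kb + kc]))
    return total

# Illustrative parameters (assumptions, not from the article).
p_x = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}
p_z_given_y = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
joint = joint_chain(p_x, lambda x: x % 2, p_z_given_y)

X, Y, Z = (0,), (1,), (2,)
i_yz = cond_mutual_info(joint, Y, Z)              # Y and Z are dependent
i_xz_given_y = cond_mutual_info(joint, X, Z, Y)   # implied by the Markov condition
i_yz_given_x = cond_mutual_info(joint, Y, Z, X)   # extra independence, not implied

print("I(Y;Z)   =", i_yz)
print("I(X;Z|Y) =", i_xz_given_y)
print("I(Y;Z|X) =", i_yz_given_x)
```

In this sketch the third quantity vanishes (up to floating-point error) only because P(Y | X) is deterministic. The result does not depend on the particular numbers in p_z_given_y, which is the sense in which no relation between different CPDs is involved.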

Notes

A path is a sequence of consecutive edges (regardless of their direction) that does not visit any vertex more than once.
Note that model selection procedures that are based on the minimum description length principle automatically define a probability distribution having finite description length (Grünwald 2007).

References

Cartwright, N. (1999). The dappled world: A study of the boundaries of science. Cambridge, MA: Cambridge University Press.
Cartwright, N. (2002). Against modularity, the causal Markov condition and any link between the two. British Journal for the Philosophy of Science, 53, 411–53.
Chaitin, G. (1966). On the length of programs for computing finite binary sequences. Journal of the Association for Computing Machinery, 13, 547–569.
Chaitin, G. (1975). A theory of program size formally identical to information theory. Journal of the Association for Computing Machinery, 22, 329–340.
Daniusis, P., Janzing, D., Mooij, J., Zscheischler, J., Steudel, B., Zhang, K., et al. (2010). Inferring deterministic causal relations. In: Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence (UAI).
Gacs, P., Tromp, J., & Vitányi, P. (2001). Algorithmic statistics. IEEE Transactions on Information Theory, 47(6), 2443–2463.
Grünwald, P. (2007). The minimum description length principle. Cambridge, MA: MIT Press.
Hoyer, P. O., Janzing, D., Mooij, J. M., Peters, J., & Schölkopf, B. (2008). Nonlinear causal discovery with additive noise models. In: D. Koller, D. Schuurmans, Y. Bengio & L. Bottou (Eds.), NIPS (pp. 689–696). Cambridge, MA: MIT Press.
Hutter, M. (2007). On universal prediction and Bayesian confirmation. Theoretical Computer Science, 384(1), 33–48.
Janzing, D., & Schölkopf, B. (2010). Causal inference using the algorithmic Markov condition. IEEE Transactions on Information Theory, 56(10), 5168–5194.
Janzing, D., & Steudel, B. (2010). Justifying additive-noise-based causal discovery via algorithmic information theory. Open Systems and Information Dynamics, 17(2), 189–212.
Janzing, D., Sun, X., & Schölkopf, B. (2009). Distinguishing cause and effect via second order exponential models. http://arxiv.org/abs/0910.5561.
Janzing, D., Hoyer, P., & Schölkopf, B. (2010). Telling cause from effect based on high-dimensional observations. In: Proceedings of the International Conference on Machine Learning (ICML), Haifa, Israel.
Janzing, D., Mooij, J., Zhang, K., Lemeire, J., Zscheischler, J., Daniusis, P., et al. (2012). Information-geometric approach to inferring causal directions. Artificial Intelligence, 182–183, 1–31.
Kolmogorov, A. (1965). Three approaches to the quantitative definition of information. Problems of Information Transmission, 1(1), 1–7.
Korb, K. B., & Nyberg, E. (2006). The power of intervention. Minds and Machines, 16(3), 289–302.
Lauritzen, S. L. (1996). Graphical models. Oxford: Clarendon Press.
Lauritzen, S. L., & Richardson, T. S. (2002). Chain graph models and their causal interpretation. Journal of the Royal Statistical Society, Series B, 64, 321–361.
Lemeire, J., & Dirkx, E. (2006). Causal models as minimal descriptions of multivariate systems. http://parallel.vub.ac.be/~jan.
Lemeire, J., Meganck, S., Cartella, F., Liu, T., & Statnikov, A. (2011a). Inferring the causal decomposition under the presence of deterministic relations. In: Special session learning of causal relations at the ESANN conference.
Lemeire, J., Steenhaut, K., & Touhafi, A. (2011b). When are graphical causal models not good models? In: J. Williamson, F. Russo & P. McKay (Eds.), Causality in the sciences. Oxford: Oxford University Press.
Levin, L. (1974). Laws of information conservation (non-growth) and aspects of the foundation of probability theory. Problems of Information Transmission, 10(3), 206–210.
Meek, C. (1995). Strong completeness and faithfulness in Bayesian networks. In: Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), pp. 411–418.
Pearl, J. (2000). Causality. Models, reasoning, and inference. Cambridge, MA: Cambridge University Press.
Peters, J., Janzing, D., & Schölkopf, B. (2011a). Causal inference on discrete data using additive noise models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(12), 2436–2450.
Peters, J., Mooij, J., Janzing, D., & Schölkopf, B. (2011b). Identifiability of causal graphs using functional models. In: Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence (UAI).
Rathmanner, S., & Hutter, M. (2011). A philosophical treatise of universal induction. Entropy, 13(6), 1076–1136. doi:10.3390/e13061076.
Solomonoff, R. (1960). A preliminary report on a general theory of inductive inference. Technical report V-131, report ZTB-138, Zator Co.
Solomonoff, R. (1964). A formal theory of inductive inference. Information and Control, Part II, 7(2), 224–254.
Spirtes, P., Glymour, C., & Scheines, R. (1993). Causation, prediction, and search, 2nd edn. Berlin: Springer Verlag.
Zhang, J., & Spirtes, P. (2011). Intervention, determinism, and the causal minimality condition. Synthese, 182(3), 335–347.
Zscheischler, J., Janzing, D., & Zhang, K. (2011). Testing whether linear equations are causal: A free probability theory approach. In: Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence (UAI).

Acknowledgments

We would like to thank the blind reviewers for helping us to structure our exposition and make our ideas clear. We would also like to thank Patrik Hoyer for providing us with the example of Sect. "Both FF and IC are Sanity Checks of the Model Class". This work was partially carried out within the framework of the Prognostics for Optimal Maintenance (POM) project (grant nr. 100031; http://www.pom-sbo.org), which is financially supported by the Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT-Vlaanderen).

Author information

Authors and Affiliations

ETRO Department, Vrije Universiteit Brussel (VUB), Pleinlaan 2, 1050, Brussels, Belgium
Jan Lemeire
FMI Department, Interdisciplinary Institute for Broadband Technology (IBBT), Gaston Crommenlaan 8, box 102, 9050, Ghent, Belgium
Jan Lemeire
MPI for Intelligent Systems, Tübingen, Germany
Dominik Janzing

Corresponding author

Correspondence to Jan Lemeire.

About this article

Cite this article

Lemeire, J., Janzing, D. Replacing Causal Faithfulness with Algorithmic Independence of Conditionals. Minds & Machines 23, 227–249 (2013). https://doi.org/10.1007/s11023-012-9283-1

Received: 23 June 2010
Accepted: 25 June 2012
Published: 22 July 2012
Issue Date: May 2013
DOI: https://doi.org/10.1007/s11023-012-9283-1
