
Graphical Causal Models with Discretized Data and Background Information

  • Conference paper
Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU 2024)

Abstract

In several application areas, discretized variables stand in for an underlying continuous variable. For example, the level of a medical measure may be recorded as ‘low’, ‘medium’ or ‘high’, although the underlying measure is continuous. Estimating graphical causal models from data with discretized variables leads to biased estimates and underestimated causal relations. In this work, we study the effect of incorporating background information on causal relations when estimating causal models with discretized variables. We show that incorporating background information on the relations between variables improves graphical causal model estimates when variables are discretized. The gains are particularly large in reducing omitted causal relations and in estimating causal relations correctly. We relate these improvements to the hyperparameter choice in graphical causal models and to properties of the variables in the model.
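The underestimation described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' experimental setup: the linear Gaussian model, the effect size of 0.5, the tercile cut-offs, and the helper names `discretize_terciles` and `correlation_before_and_after` are all assumptions chosen for illustration. It compares the correlation between two continuous variables with the correlation between their ‘low’/‘medium’/‘high’ versions.

```python
import numpy as np


def discretize_terciles(values):
    """Map a continuous array to codes 0 (low), 1 (medium), 2 (high).

    Cut-offs are the empirical terciles (an assumed discretization scheme).
    """
    cutoffs = np.quantile(values, [1 / 3, 2 / 3])
    return np.digitize(values, cutoffs)


def correlation_before_and_after(n=20_000, effect=0.5, seed=0):
    """Simulate x -> y and return (continuous, discretized) correlations."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = effect * x + rng.normal(size=n)  # assumed linear causal relation
    r_continuous = np.corrcoef(x, y)[0, 1]
    r_discretized = np.corrcoef(
        discretize_terciles(x), discretize_terciles(y)
    )[0, 1]
    return r_continuous, r_discretized


r_cont, r_disc = correlation_before_and_after()
print(f"continuous: {r_cont:.3f}, discretized: {r_disc:.3f}")
```

Under these assumptions the discretized correlation comes out smaller than the continuous one. A weaker measured association of this kind is one way an independence test in a structure-learning algorithm could drop a true edge, and it is this sort of omission that background information on known relations is meant to counteract.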

Acknowledgement

N. Baştürk is partially supported by NWO grant number 195.187.

Author information

Corresponding author

Correspondence to Rui Jorge Almeida.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Baştürk, N., Rajapakshe, C., Almeida, R.J. (2024). Graphical Causal Models with Discretized Data and Background Information. In: Lesot, MJ., et al. Information Processing and Management of Uncertainty in Knowledge-Based Systems. IPMU 2024. Lecture Notes in Networks and Systems, vol 1174. Springer, Cham. https://doi.org/10.1007/978-3-031-74003-9_19
