Abstract
In several application areas, discretized variables represent an underlying continuous variable. For example, the level of a medical measure may be recorded as ‘low’, ‘medium’ or ‘high’, although the underlying measure is continuous. Estimating graphical causal models from data with discretized variables leads to biased estimates and underestimated causal relations. In this work, we study the effect of incorporating background information on causal relations when estimating causal models with discretized variables. We show that incorporating such background information improves graphical causal model estimates when variables are discretized. The gains are particularly large in reducing omitted causal relations and in estimating causal relations correctly. We relate these improvements to the hyperparameter choice in graphical causal models and to properties of the variables in the model.
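The underestimation described above can be illustrated with a small simulation. This is a minimal sketch and not the chapter's experimental setup: the linear relation, its coefficient, the noise scale, the sample size, the three equal-frequency levels and the use of Pearson correlation as a proxy for the strength of a relation are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000  # assumed sample size

# Hypothetical continuous cause x and effect y with a linear relation.
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(scale=0.6, size=n)


def discretize(values, n_levels=3):
    """Map values to integer levels 0..n_levels-1 using equal-frequency cut points."""
    cuts = np.quantile(values, np.linspace(0, 1, n_levels + 1)[1:-1])
    return np.digitize(values, cuts)


# 'low' / 'medium' / 'high' versions of both variables.
x_disc = discretize(x)
y_disc = discretize(y)

# Strength of association before and after discretization.
corr_continuous = np.corrcoef(x, y)[0, 1]
corr_discretized = np.corrcoef(x_disc, y_disc)[0, 1]

print(f"continuous:  {corr_continuous:.3f}")
print(f"discretized: {corr_discretized:.3f}")
```

Under these assumptions the association measured on the three-level variables is weaker than on the continuous ones, which is the kind of attenuation that can lead an independence test in a structure learning algorithm to omit an edge.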
Acknowledgement
N. Baştürk is partially supported by NWO grant number 195.187.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Baştürk, N., Rajapakshe, C., Almeida, R.J. (2024). Graphical Causal Models with Discretized Data and Background Information. In: Lesot, MJ., et al. Information Processing and Management of Uncertainty in Knowledge-Based Systems. IPMU 2024. Lecture Notes in Networks and Systems, vol 1174. Springer, Cham. https://doi.org/10.1007/978-3-031-74003-9_19
DOI: https://doi.org/10.1007/978-3-031-74003-9_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-74002-2
Online ISBN: 978-3-031-74003-9
eBook Packages: Intelligent Technologies and Robotics (R0)