Detection of Conditional Dependence Between Multiple Variables Using Multiinformation

  • Conference paper
Computational Science – ICCS 2021 (ICCS 2021)

Abstract

We consider the problem of detecting conditional dependence between multiple discrete variables. This generalizes the well-known and widely studied problem of testing conditional independence between two variables given a third one. The issue is important in various applications. For example, in supervised learning, such a test can be used to verify the adequacy of the popular Naive Bayes classifier. In epidemiology, there is a need to verify whether the occurrences of multiple diseases are dependent. However, focusing solely on occurrences of diseases may be misleading: one has to take into account confounding variables (such as gender or age) and preferably consider the conditional dependencies between diseases given the confounding variables. To address this problem, we propose to use conditional multiinformation (CMI), a measure derived from information theory. We prove some new properties of CMI. To account for the uncertainty associated with a given data sample, we propose a formal statistical test of conditional independence based on the empirical version of CMI. The main contribution of the work is the determination of the asymptotic distribution of empirical CMI, which leads to the construction of an asymptotic test for conditional independence. The asymptotic test is compared with the permutation test and the scaled chi-squared test. Simulation experiments indicate that the asymptotic test achieves larger power than the competing methods, thus leading to more frequent detection of conditional dependencies when they occur. We apply the method to detect dependencies in the MIMIC-III medical data set.
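
As a rough illustration of the quantities named in the abstract, the Python sketch below assumes the usual plug-in form of conditional multiinformation for discrete data, CMI(X_1, ..., X_p | Z) = sum_i H(X_i | Z) - H(X_1, ..., X_p | Z), and pairs it with a within-stratum permutation test, which is one of the comparison methods the abstract mentions. It does not reproduce the asymptotic test or its limiting distribution. The function names `entropy`, `empirical_cmi`, and `permutation_pvalue` are hypothetical and chosen only for this example.

```python
from collections import Counter

import numpy as np


def entropy(outcomes):
    """Plug-in Shannon entropy (in nats) of a list of hashable outcomes."""
    counts = np.array(list(Counter(outcomes).values()), dtype=float)
    probs = counts / counts.sum()
    return float(-(probs * np.log(probs)).sum())


def empirical_cmi(xs, z):
    """Plug-in estimate of sum_i H(X_i | Z) - H(X_1, ..., X_p | Z).

    xs: list of 1-D arrays of discrete values; z: 1-D array of the same length.
    Assumed definition; the paper's exact estimator may differ.
    """
    xs = [np.asarray(x).tolist() for x in xs]
    z = np.asarray(z).tolist()
    h_z = entropy(z)
    # H(X_i | Z) = H(X_i, Z) - H(Z)
    marginal = sum(entropy(list(zip(x, z))) - h_z for x in xs)
    # H(X_1, ..., X_p | Z) = H(X_1, ..., X_p, Z) - H(Z)
    joint = entropy(list(zip(*xs, z))) - h_z
    return marginal - joint


def permutation_pvalue(xs, z, n_perm=200, seed=0):
    """Permutation p-value: shuffle X_2..X_p independently within each level of Z."""
    rng = np.random.default_rng(seed)
    xs = [np.asarray(x) for x in xs]
    z = np.asarray(z)
    observed = empirical_cmi(xs, z)
    strata = [np.flatnonzero(z == level) for level in np.unique(z)]
    exceed = 0
    for _ in range(n_perm):
        permuted = [x.copy() for x in xs]
        for idx in strata:
            # Shuffling inside a stratum keeps each X_i | Z marginal intact
            # while breaking any dependence among the X_i given Z.
            for x in permuted[1:]:
                x[idx] = x[rng.permutation(idx)]
        if empirical_cmi(permuted, z) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    z = rng.integers(0, 2, size=400)
    x1 = rng.integers(0, 2, size=400)
    x2 = x1.copy()                      # fully dependent on x1 given z
    x3 = rng.integers(0, 2, size=400)   # unrelated noise
    print(permutation_pvalue([x1, x2, x3], z))
```

Shuffling within the levels of Z, rather than across the whole sample, is what makes the null hypothesis conditional: dependence induced only by the confounder is preserved in every permuted sample, so it cannot by itself produce a small p-value.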

Notes

  1. https://github.com/teisseyrep/cmi

Author information

Corresponding author

Correspondence to Jan Mielniczuk.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Mielniczuk, J., Teisseyre, P. (2021). Detection of Conditional Dependence Between Multiple Variables Using Multiinformation. In: Paszynski, M., Kranzlmüller, D., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds) Computational Science – ICCS 2021. ICCS 2021. Lecture Notes in Computer Science, vol 12747. Springer, Cham. https://doi.org/10.1007/978-3-030-77980-1_51

  • DOI: https://doi.org/10.1007/978-3-030-77980-1_51

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-77979-5

  • Online ISBN: 978-3-030-77980-1

  • eBook Packages: Computer Science, Computer Science (R0)
