Independence test via mutual information in the presence of measurement errors

  • Original Paper
  • Statistics and Computing

Abstract

Among existing methods for independence testing, mutual information (MI) is popular because it is invariant to monotone transformations and enjoys higher power in detecting nonlinear associations. In this paper, we propose a novel MI-based independence test for data observed with measurement errors. The conditional density functions involved in MI are estimated using a novel deconvolution double kernel method. The convergence rates of these estimates are derived under the assumption that the measurement errors are either ordinary smooth or super smooth. In addition, the asymptotic behavior of the resulting estimate of MI is established under both the null and alternative hypotheses. Extensive simulation studies and an application to a dataset of low-resolution observations of source stars confirm the superior numerical performance of the proposed methods.
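As a rough illustration of the deconvolution idea mentioned in the abstract, the sketch below estimates the density of an unobserved X from contaminated observations W = X + U. It is not the authors' double kernel estimator of conditional densities; it is the classical one-dimensional deconvolution kernel density estimator, written under the assumption that U is Laplace with known scale b (an ordinary smooth error) and that the base kernel is Gaussian. In that case the deconvolution kernel has the closed form K_U(u) = K(u) [1 + (b/h)^2 (1 - u^2)]. The function names, the sample size, the bandwidth h and the scale b are all hypothetical choices made for this example.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used as the base kernel K."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def laplace_deconv_kernel(u, h, b):
    """Deconvolution kernel for Laplace(0, b) errors with a Gaussian base kernel.

    Uses K_U(u) = K(u) - (b/h)^2 K''(u), where K''(u) = (u^2 - 1) K(u).
    """
    return gaussian_kernel(u) * (1.0 + (b / h) ** 2 * (1.0 - u**2))

def deconv_kde(grid, w, h, b):
    """Estimate the density of X on `grid` from contaminated data W = X + U."""
    u = (grid[:, None] - w[None, :]) / h  # pairwise scaled differences
    return laplace_deconv_kernel(u, h, b).mean(axis=1) / h

# Simulated example (all settings are illustrative assumptions).
rng = np.random.default_rng(0)
n, b, h = 2000, 0.3, 0.4
x = rng.normal(size=n)                 # latent variable, never observed
w = x + rng.laplace(scale=b, size=n)   # observed with measurement error
grid = np.linspace(-4.0, 4.0, 81)
f_hat = deconv_kde(grid, w, h, b)      # estimate of the density of X
```

Setting b = 0 recovers the ordinary Gaussian kernel density estimator, which is a quick sanity check on the formula. The error scale is treated as known here; in practice it would have to be supplied or estimated separately.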

Data availability

Data is provided within the manuscript.

Acknowledgements

The authors thank the Associate Editor and two anonymous referees for constructive comments and helpful suggestions, which led to substantial improvements of this paper. They also thank Yingxing Li from Xiamen University and Yuexiao Dong from Temple University for their valuable comments and suggestions on improving the manuscript presentation. This research was supported by the National Social Science Fund of China (22BTJ018), Renmin University of China (22XNA026), and the National Natural Science Foundation of China (12225113, 12171477).

Author information

Contributions

Guoliang Fan: first author, conceived of the presented idea, methodology, computation and writing. Xinlin Zhang: co-first author, performed the computations, methodology and writing. Liping Zhu: corresponding author, conceived of the presented idea, developed the theory and writing. All authors reviewed the manuscript.

Corresponding author

Correspondence to Liping Zhu.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 292 KB)

About this article

Cite this article

Fan, G., Zhang, X. & Zhu, L. Independence test via mutual information in the presence of measurement errors. Stat Comput 34, 192 (2024). https://doi.org/10.1007/s11222-024-10502-9

  • DOI: https://doi.org/10.1007/s11222-024-10502-9