Abstract
Among existing methods for independence testing, mutual information (MI) is popular because it is invariant to monotone transformations and has high power in detecting nonlinear associations. In this paper, we propose a novel MI-based independence test in the presence of measurement errors. The conditional density functions involved in MI are estimated using a novel deconvolution double kernel method. The convergence rates of these estimates are derived under the assumption that the measurement errors are either ordinary smooth or super smooth. In addition, the asymptotic behaviors of the resultant estimate of MI are established under both the null and alternative hypotheses. Extensive simulation studies and an application to a dataset of low-resolution observations of source stars confirm the superior numerical performance of the proposed methods.
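To make the setting concrete, the following is a minimal Python sketch of an MI-based independence test applied to error-contaminated data. It is not the deconvolution double kernel estimator proposed in the paper: it swaps in a naive histogram plug-in estimate of MI and a permutation null, and the additive error model `w = x + u`, the simulated data, and the names `plugin_mi` and `permutation_pvalue` are all assumptions made for illustration.

```python
import numpy as np


def plugin_mi(x, y, bins=8):
    """Histogram plug-in estimate of mutual information, in nats."""
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = counts / counts.sum()            # empirical joint cell probabilities
    px = pxy.sum(axis=1, keepdims=True)    # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)    # marginal of y, shape (1, bins)
    mask = pxy > 0                         # skip empty cells (0 * log 0 = 0)
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))


def permutation_pvalue(x, y, n_perm=199, bins=8, seed=0):
    """Permutation p-value for H0: x and y are independent."""
    rng = np.random.default_rng(seed)
    observed = plugin_mi(x, y, bins)
    # Shuffling y breaks any dependence while keeping both marginals fixed.
    exceed = sum(
        plugin_mi(x, rng.permutation(y), bins) >= observed
        for _ in range(n_perm)
    )
    return (1 + exceed) / (1 + n_perm)


# Hypothetical simulated data: a nonlinear relation between x and y.
rng = np.random.default_rng(1)
x = rng.normal(size=500)
y = x**2 + 0.5 * rng.normal(size=500)

# Assumed additive measurement error: only w is observed instead of x.
w = x + 0.3 * rng.normal(size=500)

p_clean = permutation_pvalue(x, y)   # test on the error-free covariate
p_noisy = permutation_pvalue(w, y)   # naive test that ignores the error
print(f"p-value (x, y): {p_clean:.3f}   p-value (w, y): {p_noisy:.3f}")
```

The naive test on `(w, y)` simply ignores the measurement error. The paper's approach instead estimates the relevant densities by deconvolution, which this sketch does not attempt.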
Data availability
Data is provided within the manuscript.
Acknowledgements
The authors thank the Associate Editor and two anonymous referees for constructive comments and helpful suggestions, which led to substantial improvements of this paper. They also thank Yingxing Li from Xiamen University and Yuexiao Dong from Temple University for their valuable comments and suggestions on improving the manuscript presentation. This research was supported by the National Social Science Fund of China (22BTJ018), Renmin University of China (22XNA026), and the National Natural Science Foundation of China (12225113, 12171477).
Author information
Contributions
Guoliang Fan: first author, conceived of the presented idea, methodology, computation and writing. Xinlin Zhang: co-first author, performed the computations, methodology and writing. Liping Zhu: corresponding author, conceived of the presented idea, developed the theory and writing. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Fan, G., Zhang, X. & Zhu, L. Independence test via mutual information in the presence of measurement errors. Stat Comput 34, 192 (2024). https://doi.org/10.1007/s11222-024-10502-9
DOI: https://doi.org/10.1007/s11222-024-10502-9