Abstract
This paper studies support recovery for the Gaussian graphical model (GGM) with false discovery rate (FDR) control. The symmetrized data aggregation (SDA) technique, which combines sample splitting, data screening, and information pooling, is applied in a node-wise manner. A matrix of test statistics with the symmetry property is constructed, and a data-driven threshold is chosen to control the FDR of the recovered support. The proposed method is shown to control the FDR asymptotically under mild conditions. Extensive simulation studies and a real-data example demonstrate that the method achieves better FDR control while retaining reasonable power in most cases.
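The abstract does not state the threshold rule, so the following is only a minimal sketch of how a data-driven threshold is commonly chosen for statistics that are symmetric about zero under the null. It assumes the knockoff-style rule: take the smallest t > 0 such that (offset + #{W ≤ -t}) / max(#{W ≥ t}, 1) ≤ α. The function names `symmetric_fdr_threshold` and `select_edges`, the `offset` parameter, and the assumption that the statistic matrix satisfies W[i, j] = W[j, i] are illustrative and are not taken from the paper.

```python
import numpy as np


def symmetric_fdr_threshold(w, alpha=0.1, offset=0):
    """Smallest t > 0 with (offset + #{w <= -t}) / max(#{w >= t}, 1) <= alpha.

    Assumption: null statistics are symmetric about zero, so the count of
    large negative values estimates the number of false positives.
    Returns infinity when no candidate threshold satisfies the bound.
    """
    w = np.asarray(w, dtype=float)
    # Only the observed magnitudes can change the counts, so scan those.
    candidates = np.sort(np.unique(np.abs(w[w != 0])))
    for t in candidates:
        n_neg = np.sum(w <= -t)  # proxy for false discoveries
        n_pos = np.sum(w >= t)   # discoveries at this threshold
        if (offset + n_neg) / max(n_pos, 1) <= alpha:
            return float(t)
    return float("inf")


def select_edges(W, alpha=0.1, offset=0):
    """Select edges (i, j), i < j, whose statistic reaches the threshold.

    Assumption: W is a p x p matrix with W[i, j] == W[j, i], so only the
    strict upper triangle is used.
    """
    W = np.asarray(W, dtype=float)
    rows, cols = np.triu_indices_from(W, k=1)
    w = W[rows, cols]
    t = symmetric_fdr_threshold(w, alpha, offset)
    keep = w >= t
    edges = [(int(i), int(j)) for i, j in zip(rows[keep], cols[keep])]
    return edges, t


if __name__ == "__main__":
    # Toy symmetric statistic matrix for 4 nodes (values are made up).
    W = np.array([
        [0.0, 5.0, -0.5, 0.4],
        [5.0, 0.0, 4.0, -0.3],
        [-0.5, 4.0, 0.0, 3.0],
        [0.4, -0.3, 3.0, 0.0],
    ])
    edges, t = select_edges(W, alpha=0.2)
    print(edges, t)
```

Setting `offset=1` gives the more conservative variant of the same rule; which variant the paper uses is not stated in the abstract.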
Ethics declarations
The authors declare no conflict of interest.
Additional information
This research was supported partially by the China National Key R&D Program under Grant Nos. 2019YFC1908502, 2022YFA1003703, 2022YFA1003802, and 2022YFA1003803, and the National Natural Science Foundation of China under Grant Nos. 11925106, 12231011, 11931001, and 11971247.
About this article
Cite this article
Zhang, Y., Liu, Y. & Wang, Z. Support Recovery of Gaussian Graphical Model with False Discovery Rate Control. J Syst Sci Complex 36, 2605–2623 (2023). https://doi.org/10.1007/s11424-023-2123-y
DOI: https://doi.org/10.1007/s11424-023-2123-y