Abstract
For the high-dimensional nonparametric Behrens-Fisher problem, in which the data dimension is larger than the sample size, the authors propose two test statistics: the U-statistic Rank-based Test (URT) and the Cauchy Combination Test (CCT). CCT is analogous to a maximum-type test, while URT is based on the sum of squared differences of the ranked samples across dimensions; it is free of distributional shape and robust to outliers. The asymptotic distribution of URT is derived, and a closed form for calculating the statistical significance of CCT is given. Extensive simulation studies are conducted to evaluate the finite-sample power of the statistics in comparison with the existing method. The simulation results show that URT is a robust and powerful method, and its practicability and effectiveness are illustrated by an application to gene expression data.
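The closed-form significance calculation mentioned for CCT can be illustrated with the generic Cauchy combination rule of Liu and Xie (2019). The sketch below is not the authors' exact construction, which the abstract does not spell out. It assumes that one p-value per dimension is already available (for example from a marginal rank test) and that equal weights are used; the function name `cauchy_combination` is hypothetical.

```python
import math


def cauchy_combination(pvalues, weights=None):
    """Combine p-values with the generic Cauchy combination rule.

    Assumptions (not taken from the paper): p-values lie strictly
    between 0 and 1, and equal weights are used when none are given.
    """
    d = len(pvalues)
    if d == 0:
        raise ValueError("need at least one p-value")
    if weights is None:
        weights = [1.0 / d] * d  # equal weights summing to 1 (assumption)
    if len(weights) != d:
        raise ValueError("weights and pvalues must have the same length")

    # Under the null, tan((0.5 - p) * pi) is a standard Cauchy variate,
    # so the weighted sum is approximately standard Cauchy as well.
    t = sum(w * math.tan((0.5 - p) * math.pi)
            for w, p in zip(weights, pvalues))

    # Upper-tail probability of a standard Cauchy distribution at t.
    return 0.5 - math.atan(t) / math.pi


if __name__ == "__main__":
    # One strong signal among otherwise unremarkable dimensions.
    print(cauchy_combination([1e-4, 0.5, 0.5]))
```

Because the tangent transform grows sharply as a p-value approaches zero, the combined statistic is dominated by the smallest p-values, which matches the abstract's description of CCT as analogous to a maximum-type test.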
Additional information
This paper was supported by the Beijing Natural Science Foundation under Grant No. Z180006 and the National Natural Science Foundation of China under Grant No. 11722113.
This paper was recommended for publication by Editor LI Qizhai.
About this article
Cite this article
Meng, Z., Li, N. & Yuan, A. Testing High-Dimensional Nonparametric Behrens-Fisher Problem. J Syst Sci Complex 35, 1098–1115 (2022). https://doi.org/10.1007/s11424-021-0257-3
DOI: https://doi.org/10.1007/s11424-021-0257-3