Power, FDR and conservativeness of BB-SGoF method

  • Original Paper
  • Computational Statistics

Abstract

The beta-binomial sequential goodness-of-fit (BB-SGoF) method for multiple testing was recently proposed as a modification of the sequential goodness-of-fit (SGoF) multiple testing method for tests that are correlated in blocks. In this paper we investigate, for the first time, the power, the false discovery rate (FDR) and the conservativeness of BB-SGoF in an intensive Monte Carlo simulation study. We also explore important features such as automatic selection of the number of blocks and preliminary testing for independence. Our study reveals that (a) BB-SGoF roughly maintains the properties of the original SGoF in the dependent case, reporting a small value for the probability that the number of false positives exceeds the number of false negatives with p-value below \(\gamma\); (b) BB-SGoF weakly controls the FDR even when the beta-binomial model is violated and the number of blocks \(k\) is unknown; and (c) the loss of power of the automatic selector for the number of blocks, relative to the benchmark method that uses the true \(k\), varies with the proportion and the type (strong, intermediate or weak) of the effects, and is also strongly influenced by the within-block correlation.
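To make the quantities named in the abstract concrete, the following minimal Python sketch computes them for a single simulated data set. It is not the authors' code and it does not implement BB-SGoF: it uses the plain SGoF rule for independent tests, assuming the usual exact-binomial formulation in which the number of rejections is max(R - b + 1, 0), where R is the number of p-values not exceeding \(\gamma\) and b is the smallest count whose Bin(n, \(\gamma\)) upper-tail probability is at most \(\alpha\). The function names `binom_upper_tail`, `sgof_rejections` and `error_summary` are hypothetical and chosen only for this illustration.

```python
import math


def binom_upper_tail(n, p, b):
    """P(Bin(n, p) >= b), with each term computed in log space."""
    if b <= 0:
        return 1.0
    total = 0.0
    for k in range(b, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1)
                   - math.lgamma(n - k + 1)
                   + k * math.log(p) + (n - k) * math.log(1 - p))
        total += math.exp(log_pmf)
    return min(total, 1.0)


def sgof_rejections(pvalues, gamma=0.05, alpha=0.05):
    """Number of nulls rejected by plain SGoF (independence assumed)."""
    n = len(pvalues)
    observed = sum(p <= gamma for p in pvalues)
    # Smallest count b whose binomial upper tail is at most alpha.
    b = 0
    while b <= n and binom_upper_tail(n, gamma, b) > alpha:
        b += 1
    return max(observed - b + 1, 0)


def error_summary(pvalues, is_effect, gamma=0.05, alpha=0.05):
    """False discovery proportion, power, and FP vs. FN counts for one data set."""
    n_rej = sgof_rejections(pvalues, gamma, alpha)
    # SGoF rejects the n_rej smallest p-values.
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    rejected = set(order[:n_rej])
    false_pos = sum(1 for i in rejected if not is_effect[i])
    true_pos = n_rej - false_pos
    # False negatives restricted to p-values below gamma, as in finding (a).
    false_neg_below = sum(
        1 for i, p in enumerate(pvalues)
        if is_effect[i] and p <= gamma and i not in rejected
    )
    n_effects = sum(is_effect)
    return {
        "rejections": n_rej,
        "fdp": false_pos / n_rej if n_rej else 0.0,
        "power": true_pos / n_effects if n_effects else 0.0,
        "fp_exceeds_fn": false_pos > false_neg_below,
    }


# Toy input: 10 true effects with small p-values, 10 nulls with p = 0.5.
pvals = [0.001 * (i + 1) for i in range(10)] + [0.5] * 10
truth = [True] * 10 + [False] * 10
summary = error_summary(pvals, truth)
```

Averaging `fdp`, `power` and `fp_exceeds_fn` over many simulated data sets would give Monte Carlo estimates of the FDR, the power and the probability in finding (a). Reproducing the paper's setting would additionally require block-correlated p-values and the beta-binomial correction, neither of which is sketched here.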

Acknowledgments

This work was supported by Grant MTM2011-23204 (FEDER support included) of the Spanish Ministry of Science and Innovation.

Author information

Corresponding author

Correspondence to Irene Castro-Conde.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (R 2 KB)

About this article

Cite this article

Castro-Conde, I., de Uña-Álvarez, J. Power, FDR and conservativeness of BB-SGoF method. Comput Stat 30, 1143–1161 (2015). https://doi.org/10.1007/s00180-015-0553-2

  • DOI: https://doi.org/10.1007/s00180-015-0553-2
