Weighted multiple testing procedure for grouped hypotheses with k-FWER control

  • Original Paper
  • Computational Statistics

Abstract

In this paper, k-FWER (generalized familywise error rate) control for testing grouped hypotheses is considered. We propose weights for the p-values in each group, obtained by maximizing an objective function, namely the expected proportion of rejected hypotheses. This objective function uses not only the proportion of true null hypotheses but also the null and non-null distributions of the p-values in each group. When this information is known a priori, our weighted testing procedure controls the k-FWER for arbitrarily dependent p-values. When this information is unknown and is estimated from the data, our procedure asymptotically controls the k-FWER under a weak dependence assumption on the p-values in each group. The new procedure is shown to be more powerful than some existing procedures, both in theory and in simulations. For illustration, the proposed procedure is applied to the adequate yearly progress data.

References

  • Basu P, Cai T, Das K, Sun W (2018) Weighted false discovery rate control in large-scale multiple testing. J Am Stat Assoc. https://doi.org/10.1080/01621459.2017.1336443

  • Benjamini Y, Cohen R (2017) Weighted false discovery rate controlling procedures for clinical trials. Biostatistics 18:91–104

  • Benjamini Y, Heller R (2007) False discovery rates for spatial signals. J Am Stat Assoc 102:1272–1281

  • Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 57:289–300

  • Benjamini Y, Hochberg Y (2000) On the adaptive control of the false discovery rate in multiple testing with independent statistics. J Educ Behav Stat 25:60–83

  • Cai T, Sun W (2009) Simultaneous testing of grouped hypotheses: finding needles in multiple haystacks. J Am Stat Assoc 104:1467–1481

  • Clements N, Sarkar SK, Guo W (2012) Astronomical transient detection controlling the false discovery rate. In: Feigelson E, Babu G (eds) Statistical challenges in modern astronomy V. Springer, New York, pp 383–396

  • Efron B (2008) Simultaneous inference: when should hypothesis testing problems be combined? Ann Appl Stat 2:197–223

  • Genovese C, Wasserman L (2004) A stochastic process approach to false discovery control. Ann Stat 32:1035–1061

  • Guo W, Romano JP (2007) A generalized Sidak-Holm procedure and control of generalized error rates under independence. Stat Appl Genet Mol Biol 6:3

  • Hu J, Zhao H, Zhou H (2010) False discovery rate control with groups. J Am Stat Assoc 105:1215–1227

  • Jin J (2008) Proportion of non-zero normal means: universal oracle equivalences and uniformly consistent estimators. J R Stat Soc Ser B 70:461–493

  • Jin J, Cai T (2007) Estimating the null and the proportion of nonnull effects in large-scale multiple comparisons. J Am Stat Assoc 102:495–506

  • Kellerer H, Pferschy U, Pisinger D (2004) Knapsack problems. Springer, Berlin

  • Lehmann EL, Romano JP (2005) Generalizations of the familywise error rate. Ann Stat 33:1138–1154

  • Liu Y, Sarkar SK, Zhao Z (2016) A new approach to multiple testing of grouped hypotheses. J Stat Plan Inference 179:1–14

  • Romano JP, Shaikh AM (2006) Stepup procedures for control of generalizations of the familywise error rate. Ann Stat 34:1850–1873

  • Sarkar SK (2007) Stepup procedures controlling generalized FWER and generalized FDR. Ann Stat 35:2405–2420

  • Sarkar SK (2008) Generalizing Simes’ test and Hochberg’s stepup procedure. Ann Stat 36:337–363

  • Storey JD, Taylor JE, Siegmund D (2004) Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach. J R Stat Soc Ser B 66:187–205

  • Sun W, Cai T (2007) Oracle and adaptive compound decision rules for false discovery rate control. J Am Stat Assoc 102:901–912

  • van der Laan MJ, Dudoit S, Pollard KS (2004) Augmentation procedures for control of the generalized family-wise error rate and tail probabilities for the proportion of false positives. Stat Appl Genet Mol Biol 3(1):15

  • Wang L, Xu X (2012) Step-up procedure controlling generalized family-wise error rate. Stat Probab Lett 82:775–782

  • Zhao H (2014) Adaptive FWER control procedure for grouped hypotheses. Stat Probab Lett 95:63–70

  • Zhao H, Zhang J (2014) Weighted \(p\)-value procedures for controlling FDR of grouped hypotheses. J Stat Plan Inference 151–152:90–106

Acknowledgements

The author thanks the reviewers and associate editor for their useful comments, which greatly improved the quality of the paper. The author also thanks Professor Wenguang Sun for sharing his data and code. The work was supported by the National Natural Science Foundation of China (Grant Nos. 11626227, 11671398) and the Fundamental Research Funds for the Central Universities (Grant No. 2015QS03).

Author information

Corresponding author

Correspondence to Li Wang.

Appendix

Proof of Proposition 1

Proof

$$\begin{aligned} power(\varvec{\omega }^*,\frac{k}{n}\alpha )&=\frac{1}{n_{T1}}\sum _{g=1}^G\sum _{j\in I_{g1}}P\left( p_{gj}\le \frac{k\alpha }{na_{T0}}\right) \\&=\sum _{g=1}^G\frac{n_{g1}}{n_{T1}}F\left( \frac{k\alpha }{na_{T0}}\right) \\&=F\left( \frac{k\alpha }{na_{T0}}\right) , \end{aligned}$$

and

$$\begin{aligned} power(\varvec{\omega }^{**},\frac{k}{n}\alpha )&=\frac{1}{n_{T1}}\sum _{g=1}^G\sum _{j\in I_{g1}}P\left( p_{gj}\le \frac{k\alpha }{na_{g0}}\right) \\&=\sum _{g=1}^G\frac{n_{g1}}{n_{T1}}F\left( \frac{k\alpha }{na_{g0}}\right) \\&\ge F\left( \frac{k\alpha }{n\sum _{g=1}^G\frac{n_{g1}}{n_{T1}}a_{g0}}\right) , \end{aligned}\tag{5}$$

where the last inequality follows from Jensen's inequality and the convexity of \(F(\cdot /x)\) as a function of \(x\), and equality holds if and only if \(a_{10}=\cdots =a_{G0}\). Thus, it remains to verify that \(na_{T0}\ge n\sum _{g=1}^G\frac{n_{g1}}{n_{T1}}a_{g0}\). We have

$$\begin{aligned}&na_{T0}- n\sum _{g=1}^G\frac{n_{g1}}{n_{T1}}a_{g0}\\&\quad =\sum _{g=1}^Gn_{g0}-\frac{n}{n_{T1}}\sum _{g=1}^{G}n_{g1}a_{g0}\\&\quad =\sum _{g=1}^Gn_{g0}-\frac{n}{n-\sum _{g=1}^Gn_{g0}}\sum _{g=1}^{G}(n_g-n_{g0})\frac{n_{g0}}{n_g}\\&\quad =\frac{\sum _{g=1}^Gn_{g}\sum _{g=1}^{G}\frac{n_{g0}^2}{n_g}-(\sum _{g=1}^Gn_{g0})^2}{n-\sum _{g=1}^Gn_{g0}}\ge 0, \end{aligned}$$

where the last inequality follows from the Cauchy-Schwarz inequality, and equality holds if and only if \(n_{10}/n_1=\cdots =n_{G0}/n_G\), i.e. \(a_{10}=\cdots =a_{G0}\).
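
For completeness, the Cauchy-Schwarz step can be written out explicitly. The splitting \(n_{g0}=\sqrt{n_g}\cdot n_{g0}/\sqrt{n_g}\) used here is the standard device and is added only as a worked step; it is not spelled out in the original proof:

$$\begin{aligned} \left( \sum _{g=1}^Gn_{g0}\right) ^2=\left( \sum _{g=1}^G\sqrt{n_g}\,\frac{n_{g0}}{\sqrt{n_g}}\right) ^2\le \sum _{g=1}^Gn_g\sum _{g=1}^G\frac{n_{g0}^2}{n_g}, \end{aligned}$$

with equality if and only if \(n_{g0}/\sqrt{n_g}\) is proportional to \(\sqrt{n_g}\), that is, if and only if \(n_{g0}/n_g\) does not depend on \(g\).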

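Since \(F\) is nondecreasing, this bound together with (5) gives \(power(\varvec{\omega }^{**},\frac{k}{n}\alpha )\ge power(\varvec{\omega }^{*},\frac{k}{n}\alpha )\), which completes the proof. \(\square \)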
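
The two inequalities used in this proof can be checked numerically. The sketch below is illustrative only: the group sizes, the values of k and α, and the non-null distribution F(t) = √t are assumptions chosen for the example, not values from the paper. The weights are read off from the thresholds in the displays above (a common weight 1/a_{T0} for ω*, group weights 1/a_{g0} for ω**).

```python
import math

# Hypothetical configuration (not from the paper): two groups.
n_g = [100, 200]    # group sizes n_g
n_g0 = [90, 100]    # numbers of true nulls n_{g0}
k, alpha = 2, 0.05


def F(t):
    """Assumed non-null p-value CDF, F(t) = sqrt(t) on [0, 1]."""
    return math.sqrt(min(max(t, 0.0), 1.0))


n = sum(n_g)
n_g1 = [ng - ng0 for ng, ng0 in zip(n_g, n_g0)]   # non-nulls per group
n_T1 = sum(n_g1)
a_g0 = [ng0 / ng for ng, ng0 in zip(n_g, n_g0)]   # null proportion per group
a_T0 = sum(n_g0) / n                              # overall null proportion

# Cauchy-Schwarz step: a_T0 >= sum_g (n_g1 / n_T1) * a_g0.
weighted_a0 = sum(m1 / n_T1 * a0 for m1, a0 in zip(n_g1, a_g0))

# Power under omega* (weight 1/a_T0) and omega** (weights 1/a_g0).
power_star = F(k * alpha / (n * a_T0))
power_2star = sum(m1 / n_T1 * F(k * alpha / (n * a0))
                  for m1, a0 in zip(n_g1, a_g0))

print("a_T0 =", a_T0, " weighted average of a_g0 =", weighted_a0)
print("power(omega*) =", power_star, " power(omega**) =", power_2star)
```

With these assumed inputs the null proportions differ across groups, so both inequalities are strict; setting `n_g0 = [50, 100]` makes the proportions equal and the two powers coincide.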

Remark 1

In the proof of Theorem 2 in Zhao (2014), it is stated that \(power(\varvec{\omega }^{**},\frac{k}{n}\alpha )=\sum _{g=1}^Ga_{g1}F(\frac{k\alpha }{na_{g0}})\), which is incorrect; formula (5) gives the corrected expression.

1.1 Proof of Theorem 1

Proof

Let V denote the number of false rejections, so that \(k\text {-}\mathrm {FWER}=P(V\ge k)\). By Markov's inequality,

$$\begin{aligned} k\text {-}\mathrm {FWER}&\le \frac{E(V)}{k}\\&=\frac{1}{k}E\left\{ \sum _{g=1}^G\sum _{j\in I_{g0}}I\left( p_{gj}\le \omega _{g} \frac{k}{n}\alpha \right) \right\} \\&=\frac{n}{k}\sum _{g=1}^Ga_ga_{g0}F_{g0}\left( \omega _g\frac{k}{n}\alpha \right) \\&=\frac{n}{k}\sum _{g=1}^Ga_ga_{g0}\omega _g\frac{k}{n}\alpha \\&=\frac{n}{k}\frac{k}{n}\alpha =\alpha . \end{aligned}$$

\(\square \)
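
As a minimal sketch of the procedure analysed in this proof, the code below implements the rejection rule \(p_{gj}\le \omega _g\frac{k}{n}\alpha \) and evaluates the Markov bound \(\frac{n}{k}\sum _ga_ga_{g0}F_{g0}(\omega _g\frac{k}{n}\alpha )\) for uniform null p-values. The function names, the group sizes and the choice \(\omega _g=1/a_{g0}\) are assumptions made for illustration; the bound equals α whenever the weights satisfy \(\sum _ga_ga_{g0}\omega _g=1\), which is what the last step of the display uses.

```python
def weighted_rejections(pvalues_by_group, weights, k, alpha):
    """Reject H_{gj} when p_{gj} <= omega_g * (k / n) * alpha."""
    n = sum(len(group) for group in pvalues_by_group)
    return [[p <= w * k * alpha / n for p in group]
            for group, w in zip(pvalues_by_group, weights)]


def markov_bound(n_g, n_g0, weights, k, alpha):
    """E(V)/k = (n/k) * sum_g a_g * a_{g0} * F_{g0}(omega_g * k * alpha / n),
    assuming uniform null p-values, i.e. F_{g0}(t) = t on [0, 1]."""
    n = sum(n_g)
    total = 0.0
    for ng, ng0, w in zip(n_g, n_g0, weights):
        a_g, a_g0 = ng / n, ng0 / ng
        total += a_g * a_g0 * min(1.0, w * k * alpha / n)
    return n / k * total


# Hypothetical example: two groups, weights omega_g = 1 / a_{g0}.
n_g, n_g0 = [100, 200], [90, 100]
k, alpha = 2, 0.05
weights = [ng / ng0 for ng, ng0 in zip(n_g, n_g0)]
bound = markov_bound(n_g, n_g0, weights, k, alpha)
print("Markov bound on k-FWER:", bound)

# Tiny usage example of the rejection rule (n = 4, threshold 0.025).
decisions = weighted_rejections([[0.0001, 0.5], [0.002, 0.9]], [1.0, 1.0], 2, 0.05)
print(decisions)
```

Because the bound relies only on Markov's inequality and the marginal null distributions, no assumption on the dependence between p-values enters the calculation.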

1.2 Proof of Theorem 2

Proof

For simplicity, we drop the subscript n in \(V_n\) and \(k_n\) throughout the proof.

(1) Since \(\frac{k}{n}\alpha \rightarrow c\alpha \), following the proof of Lemma 3 in Zhao and Zhang (2014), we obtain

$$\begin{aligned} {\hat{r}}\left( \varvec{\omega }_0\left( {\hat{r}}, \frac{k}{n}\alpha \right) ,\frac{k}{n}\alpha \right) {\mathop {\rightarrow }\limits ^{\mathrm {P}}}{\tilde{r}}(\varvec{\omega }_0({\tilde{r}}, c\alpha ),c\alpha ), \text { as } n\rightarrow \infty , \end{aligned}$$

and

$$\begin{aligned} {\widehat{power}}\left( \varvec{\omega }_0\left( {\hat{r}}, \frac{k}{n}\alpha \right) ,\frac{k}{n}\alpha \right) {\mathop {\rightarrow }\limits ^{\mathrm {P}}}\widetilde{power}(\varvec{\omega }_0({\tilde{r}}, c\alpha ),c\alpha ), \text { as } n\rightarrow \infty . \end{aligned}$$

(2) Write \(\varvec{\omega }_0({\hat{r}}, \frac{k}{n}\alpha )=(\omega _{10}({\hat{r}}, \frac{k}{n}\alpha ),\ldots ,\omega _{G0}({\hat{r}}, \frac{k}{n}\alpha ))'\) and \(\varvec{\omega }_0({\tilde{r}}, c\alpha )=(\omega _{10}({\tilde{r}}, c\alpha ),\ldots ,\omega _{G0}({\tilde{r}}, c\alpha ))'\). As in (1),

$$\begin{aligned} \frac{1}{n}\sum _{g=1}^G\sum _{j\in I_{g0}}I\left( p_{gj}\le \omega _{g0}({\hat{r}}, \frac{k}{n}\alpha )\frac{k}{n}\alpha \right) {\mathop {\rightarrow }\limits ^{\mathrm {P}}}\sum _{g=1}^G\pi _g\pi _{g0}F_{g0}\left( \omega _{g0}({\tilde{r}}, c\alpha )c\alpha \right) . \end{aligned}$$

By the dominated convergence theorem, we have

$$\begin{aligned} E\left[ \frac{1}{n}\sum _{g=1}^G\sum _{j\in I_{g0}}I\left( p_{gj}\le \omega _{g0}({\hat{r}}, \frac{k}{n}\alpha )\frac{k}{n}\alpha \right) \right] \rightarrow \sum _{g=1}^G\pi _g\pi _{g0}F_{g0}\left( \omega _{g0}({\tilde{r}}, c\alpha )c\alpha \right) . \end{aligned}$$

Then for WT\((\varvec{\omega }_0({\hat{r}}, \frac{k}{n}\alpha ),\frac{k}{n}\alpha )\),

$$\begin{aligned} k\text {-}\mathrm {FWER}&\le \frac{E(V)}{k}\\&=\frac{n}{k}E\left\{ \frac{1}{n}\sum _{g=1}^G\sum _{j\in I_{g0}}I\left( p_{gj}\le \omega _{g0}({\hat{r}}, \frac{k}{n}\alpha )\frac{k}{n}\alpha \right) \right\} \\&=\left[ \frac{1}{c}+o(1)\right] \left[ \sum _{g=1}^G\pi _g\pi _{g0}F_{g0}\left( \omega _{g0}({\tilde{r}}, c\alpha )c\alpha \right) +o(1)\right] \\&=\frac{1}{c}\sum _{g=1}^G\pi _g\pi _{g0}\omega _{g0}({\tilde{r}}, c\alpha )c\alpha +o(1)\\&=\alpha +o(1). \end{aligned}$$

It follows immediately that \(\limsup _{n\rightarrow \infty } P(V\ge k)\le \alpha .\)

(3) Since \({\hat{\pi }}_{g0}{\mathop {\rightarrow }\limits ^{\mathrm {P}}}\pi _{g0}\in (0,1)\), we have \({\hat{\pi }}_{T0}=\sum _{g=1}^Ga_g{\hat{\pi }}_{g0}{\mathop {\rightarrow }\limits ^{\mathrm {P}}}\pi _{T0}=\sum _{g=1}^G\pi _g\pi _{g0}\), and hence \(\frac{1}{{\hat{\pi }}_{T0}}\frac{k}{n}\alpha {\mathop {\rightarrow }\limits ^{\mathrm {P}}}\frac{1}{\pi _{T0}}c\alpha \). It is easy to see that

$$\begin{aligned} \frac{1}{n}\sum _{g=1}^G\sum _{j\in I_{g0}}I\left( p_{gj}\le \frac{1}{{\hat{\pi }}_{T0}}\frac{k}{n}\alpha \right) {\mathop {\rightarrow }\limits ^{\mathrm {P}}}\sum _{g=1}^G\pi _g\pi _{g0}F_{g0}\left( \frac{1}{\pi _{T0}}c\alpha \right) . \end{aligned}$$

By the dominated convergence theorem, we have

$$\begin{aligned} E\left[ \frac{1}{n}\sum _{g=1}^G\sum _{j\in I_{g0}}I\left( p_{gj}\le \frac{1}{{\hat{\pi }}_{T0}}\frac{k}{n}\alpha \right) \right] \rightarrow \sum _{g=1}^G\pi _g\pi _{g0}F_{g0}\left( \frac{1}{\pi _{T0}}c\alpha \right) . \end{aligned}$$

As in the argument for WT\((\varvec{\omega }_0({\hat{r}}, \frac{k}{n}\alpha ),\frac{k}{n}\alpha )\), we obtain

$$\begin{aligned} k\text {-}\mathrm {FWER}\le \alpha +o(1), \end{aligned}$$

which means \(\limsup _{n\rightarrow \infty } P(V\ge k)\le \alpha .\)

(4) Since \(\frac{1}{{\hat{\pi }}_{g0}}\frac{k}{n}\alpha {\mathop {\rightarrow }\limits ^{\mathrm {P}}}\frac{1}{\pi _{g0}}c\alpha \) for all g, the proof is similar, and is omitted.

\(\square \)
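
To make the plug-in weights of part (4) concrete, the following sketch estimates each \(\pi _{g0}\) from the data and rejects \(H_{gj}\) when \(p_{gj}\le \frac{1}{{\hat{\pi }}_{g0}}\frac{k}{n}\alpha \). The estimator used here, a Storey-type estimator with tuning parameter `lam = 0.5`, is an assumption for illustration only; this appendix does not state which estimator the paper uses, and the function names and the toy p-values are hypothetical.

```python
def storey_pi0(pvalues, lam=0.5):
    """Storey-type estimate of the null proportion (assumed estimator):
    #{p > lam} / ((1 - lam) * m), capped at 1."""
    m = len(pvalues)
    return min(1.0, sum(p > lam for p in pvalues) / ((1.0 - lam) * m))


def plugin_rejections(pvalues_by_group, k, alpha, lam=0.5):
    """Reject H_{gj} when p_{gj} <= (1 / pi0_hat_g) * (k / n) * alpha."""
    n = sum(len(group) for group in pvalues_by_group)
    decisions = []
    for group in pvalues_by_group:
        pi0_hat = storey_pi0(group, lam)
        if pi0_hat <= 0.0:
            # The proof assumes pi_{g0} in (0, 1); a zero estimate gives no valid weight.
            raise ValueError("estimated null proportion is zero")
        threshold = k * alpha / (n * pi0_hat)
        decisions.append([p <= threshold for p in group])
    return decisions


# Deterministic toy data: non-nulls at 1e-4, nulls on an even grid in (0, 1).
group1 = [1e-4] * 20 + [(i + 0.5) / 80 for i in range(80)]   # 80% nulls
group2 = [1e-4] * 50 + [(i + 0.5) / 50 for i in range(50)]   # 50% nulls

pi0_hats = [storey_pi0(group1), storey_pi0(group2)]
rejections = plugin_rejections([group1, group2], k=2, alpha=0.05)
print("estimated null proportions:", pi0_hats)
print("rejections per group:", [sum(r) for r in rejections])
```

Replacing the per-group estimate by the pooled estimate \({\hat{\pi }}_{T0}=\sum _ga_g{\hat{\pi }}_{g0}\) in the threshold gives the procedure treated in part (3).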

Cite this article

Wang, L. Weighted multiple testing procedure for grouped hypotheses with k-FWER control. Comput Stat 34, 885–909 (2019). https://doi.org/10.1007/s00180-018-0833-8

  • DOI: https://doi.org/10.1007/s00180-018-0833-8
