
Stochastically optimal bootstrap sample size for shrinkage-type statistics


Abstract

In nonregular problems where the conventional \(n\) out of \(n\) bootstrap is inconsistent, the \(m\) out of \(n\) bootstrap provides a useful remedy to restore consistency. Conventionally, optimal choice of the bootstrap sample size \(m\) is taken to be the minimiser of a frequentist error measure, estimation of which has posed a major difficulty hindering practical application of the \(m\) out of \(n\) bootstrap method. Relatively little attention has been paid to a stronger, stochastic, version of the optimal bootstrap sample size, defined as the minimiser of an error measure calculated directly from the observed sample. Motivated by this stronger notion of optimality, we develop procedures for calculating the stochastically optimal value of \(m\). Our procedures are shown to work under special forms of Edgeworth-type expansions which are typically satisfied by statistics of the shrinkage type. Theoretical and empirical properties of our methods are illustrated with three examples, namely the James–Stein estimator, the ridge regression estimator and the post-model-selection regression estimator.
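To fix ideas, the following is a minimal sketch of the generic \(m\) out of \(n\) bootstrap scheme that the paper builds on; the function and argument names are illustrative and not taken from the paper.

```python
import numpy as np

def m_out_of_n_bootstrap(x, statistic, m, B=1000, seed=None):
    """Return B replicates of `statistic` computed on resamples of size m
    drawn with replacement from the n observations in x.  The empirical
    distribution of the replicates plays the role of L*_{m,n}; taking
    m = n recovers the conventional bootstrap, which is inconsistent in
    the nonregular problems considered here."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    n = len(x)
    reps = np.empty(B)
    for b in range(B):
        resample = x[rng.integers(0, n, size=m)]  # m draws out of n
        reps[b] = statistic(resample, m)
    return reps
```

The only departure from the \(n\) out of \(n\) scheme is the resample size \(m<n\); everything then turns on how \(m\) is chosen.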


References

  • Ahmed, S.E., Saleh, A.K.Md.E., Volodin, A.I., Volodin, I.N.: Asymptotic expansion of the coverage probability of James–Stein estimators. Theory Probab. Appl. 51, 1–14 (2007)


  • Ahmed, S.E., Volodin, A.I., Volodin, I.N.: High order approximation for the coverage probability by a confident set centered at the positive-part James–Stein estimator. Stat. Probab. Lett. 79, 1823–1828 (2009)


  • Bickel, P., Götze, F., van Zwet, W.: Resampling fewer than \(n\) observations: gains, losses and remedies for losses. Stat. Sinica 7, 1–31 (1997)


  • Bickel, P.J., Sakov, A.: Extrapolation and the bootstrap. Sankhyā Ser. A 64, 640–652 (2002)


  • Bickel, P., Sakov, A.: On the choice of \(m\) in the \(m\) out of \(n\) bootstrap and its application to confidence bounds for extrema. Stat. Sinica 18, 967–985 (2008)


  • Claeskens, G., Hjort, N.L.: Model Selection and Model Averaging. Cambridge University Press, Cambridge (2008)


  • Datta, S., McCormick, W.P.: Bootstrap inference for a first-order autoregression with positive innovations. J. Am. Stat. Assoc. 90, 1289–1300 (1995)


  • Götze, F.: Asymptotic approximations and the bootstrap. IMS Bulletin, 56th AMS-Meeting, p. 305 (1993)

  • Götze, F., Račkauskas, A.: Adaptive choice of bootstrap sample sizes. In: State of the Art in Probability and Statistics, pp. 286–309. IMS Publications, London (2001)

  • Hall, P.: The Bootstrap and Edgeworth Expansion. Springer, New York (1992)


  • Hall, P., Horowitz, J.L., Jing, B.: On blocking rules for the bootstrap with dependent data. Biometrika 82, 561–574 (1995)


  • Hoerl, A.E., Kennard, R.W., Baldwin, K.F.: Ridge regression: some simulations. Commun. Stat. 4, 105–123 (1975)


  • Lahiri, S.N.: On bootstrapping M-estimators. Sankhyā Ser. A 54, 157–170 (1992)


  • Mammen, E.: When Does Bootstrap Work: Asymptotic Results and Simulations. Lecture Notes in Statistics, vol. 77. Springer, New York, Heidelberg (1992)

  • Politis, D.N., Romano, J.P., Wolf, M.: Subsampling. Springer, New York (1999)



Author information

Correspondence to Stephen M. S. Lee.

Additional information

Supported by Grants from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project Nos. HKU 703207P and HKU 702508P).

Appendix

1.1 Proof of Proposition 1

For \(\rho \left( \mathcal {L}_{m,n}^*-\mathcal {L}_n\right) \) to converge to zero in probability, it is necessary that \(m=o(n)\). It then follows from (A.0) that

$$\begin{aligned} \mathcal {L}_{m,n}^*(\cdot )-\mathcal {L}_n(\cdot )&= m^{-\alpha }\hat{A}_1(\cdot )+ (m/n)^{\beta }\hat{A}_2(\cdot )\nonumber \\&+\,O_p(\delta _{n,F})+ o_p\left( m^{-\alpha }+(m/n)^\beta \right) . \end{aligned}$$
(6)

It is clear that \(\rho \left( m^{-\alpha } \hat{A}_1+ (m/n)^{\beta }\hat{A}_2\right) \) is minimised at

$$\begin{aligned} m=m'_{ opt }=\Omega _p\left( n^{\beta /(\alpha +\beta )}\right) , \end{aligned}$$
(7)

so that

$$\begin{aligned}&\min _m\,\rho \left( m^{-\alpha }\hat{A}_1+(m/n)^{\beta }\hat{A}_2\right) \nonumber \\&\quad =\rho \left( {m'_{ opt }}^{\!\!\!\!-\alpha }\hat{A}_1+(m'_{ opt }/n)^{\beta } \hat{A}_2\right) \nonumber \\&\quad =\Omega _p\left( n^{-\alpha \beta /(\alpha +\beta )}\right) . \end{aligned}$$
(8)

If \(\delta _{n,F}=o(n^{-\alpha \beta /(\alpha +\beta )})\), then necessarily \(\delta _{n,F}=o_p\left( \rho \left( m^{-\alpha }\hat{A}_1+(m/n)^{\beta }\hat{A}_2\right) \right) \) for any \(m\), so that, using (6), (7) and (8),

$$\begin{aligned}&\min _m\,\rho \left( \mathcal {L}_{m,n}^*-\mathcal {L}_n\right) =\rho \left( {m'_{ opt }}^{\!\! \!\!-\alpha }\hat{A}_1+(m'_{ opt }/n)^{\beta }\hat{A}_2\right) \\&\quad \times \left\{ 1+ o_p(1)\right\} = \rho \left( \mathcal {L}_{m'_{ opt },n}^*-\mathcal {L}_n\right) \left\{ 1+o_p(1)\right\} , \end{aligned}$$

which implies \(m_{ opt }=m'_{ opt }(1+o_p(1))\) and the results of part (i) follow.
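The orders claimed in (7) and (8) follow from elementary calculus. As a heuristic, fix \(a,b>0\) (standing in for the sizes of \(\hat{A}_1\) and \(\hat{A}_2\) under \(\rho \)) and minimise \(f(m)=am^{-\alpha }+b(m/n)^{\beta }\) over \(m\):

$$\begin{aligned} f'(m)=-\alpha a m^{-\alpha -1}+\beta b n^{-\beta }m^{\beta -1}=0 \;\Longrightarrow \; m^{\alpha +\beta }=\frac{\alpha a}{\beta b}\,n^{\beta }, \end{aligned}$$

so the minimiser is \(m=(\alpha a/\beta b)^{1/(\alpha +\beta )}n^{\beta /(\alpha +\beta )}\), at which \(f\) is of exact order \(n^{-\alpha \beta /(\alpha +\beta )}\), in agreement with (7) and (8).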

Under the assumption of part (ii), we have \(\rho \left( m^{-\alpha }\hat{A}_1+(m/n)^{\beta }\hat{A}_2\right) =O_p(\delta _{n,F})\) whenever

$$\begin{aligned} m=O_p\left( n\delta _{n,F}^{1/\beta }\right) \text{ and } m^{-1}=O_p\left( \delta _{n,F}^{1/\alpha }\right) ; \end{aligned}$$
(9)

and \(\rho \left( m^{-\alpha }\hat{A}_1+(m/n)^{\beta }\hat{A}_2\right) \) has an order exceeding \(\delta _{n,F}\) if (9) fails to hold. It thus follows, using (6), that \(\rho \left( \mathcal {L}_{m_{ opt },n}^*-\mathcal {L}_n\right) =\underset{m}{\min }\,\rho \left( \mathcal {L}_{m,n}^*-\mathcal {L}_n\right) =O_p\left( \delta _{n,F}\right) \), with \(m=m_{ opt }\) satisfying (9). The last assertion of (ii) follows trivially by noting (7) and that \(m=\Omega _p\left( n^{\beta /(\alpha +\beta )}\right) \) satisfies (9), which completes the proof of part (ii).

1.2 Proof of Proposition 2

In view of Proposition 1, it suffices to prove that \(\hat{m}=m'_{ opt }(1+o_p(1))\) for \(\hat{m}=\hat{m}_k^{(1)}, \hat{m}_k^{(2)}\) and \(\hat{m}_u\).

Consider first the case \(\hat{m}=\hat{m}_u\). It follows from the proof of Bickel and Sakov’s (2008) Theorem 3 that \(\underset{m}{\inf }\, \rho (\mathcal {L}_{m,n}^*-\mathcal {L}_n)=\Omega _p(n^{-\alpha \beta /(\alpha +\beta )})\) and \(\hat{m}_q=\Omega _p(m_{ opt })=\Omega _p(n^{\beta /(\alpha +\beta )})\). Under (A.0) and the conditions assumed on the \(m_i\)’s, we have

$$\begin{aligned}&\left\{ \mathcal {L}_{m_1,n}^*(\cdot )-\mathcal {L}_{m_3,n}^*(\cdot )\right\} \left\{ \mathcal {L}_{m_2,n}^*(\cdot )-\mathcal {L}_{m_4,n}^*(\cdot )\right\} ^{-1}\\&\quad =(m_1/m_2)^{-\alpha }(1+o_p(1)), \end{aligned}$$

so that \(\hat{\alpha }(\cdot )=\alpha \left\{ 1+o_p((\ln n)^{-1})\right\} \). It follows that

$$\begin{aligned}&m_1^{\hat{\alpha }(\cdot )}\left\{ \mathcal {L}_{m_1,n}^*(\cdot )-\mathcal {L}_{m_4,n}^* (\cdot )\right\} \\&\quad =m_1^{\alpha }\left\{ \mathcal {L}_{m_1,n}^*(\cdot )-\mathcal {L}_{m_4, n}^*(\cdot )\right\} +o_p(1)=\hat{A}_1(\cdot )+o_p(1). \end{aligned}$$

Similar arguments show that \(\hat{\beta }(\cdot )=\beta \left\{ 1+o_p((\ln n)^{-1})\right\} \), and hence

$$\begin{aligned} l_1^{-\hat{\beta }(\cdot )}\left\{ \mathcal {L}_{l_1,n}^*(\cdot )-\mathcal {L}_{l_4,n}^* (\cdot )\right\} =n^{-\beta }\hat{A}_2(\cdot )+o_p(n^{-\beta }). \end{aligned}$$

Using (A.0) again, we have

$$\begin{aligned}&m^{-\alpha }\hat{A}_1+(m/n)^{\beta }\hat{A}_2=(m/m_1)^{-\hat{\alpha }} \left\{ \mathcal {L}_{m_1,n}^*-\mathcal {L}_{m_4,n}^*\right\} \\&\quad +\, (m/l_1)^{\hat{\beta }} \left\{ \mathcal {L}_{l_1,n}^*-\mathcal {L}_{l_4,n}^*\right\} + o_p(m^{-\alpha }+m^{\beta }n^{-\beta }), \end{aligned}$$

which implies that \(\hat{m}_u=m'_{ opt }(1+o_p(1))\). The same result holds for \(\hat{m}_k^{(1)}\) as an immediate corollary.
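As an aside, the log-ratio step used above for \(\hat{\alpha }\) is straightforward to implement. The sketch below is schematic: `L` is assumed to map a bootstrap sample size \(m\) and a point \(x\) to the bootstrap CDF \(\mathcal {L}^*_{m,n}(x)\); all names are hypothetical.

```python
import numpy as np

def estimate_alpha(L, m1, m2, m3, m4, x):
    """Schematic log-ratio estimator of alpha: by the expansion above,
    (L*_{m1,n}(x) - L*_{m3,n}(x)) / (L*_{m2,n}(x) - L*_{m4,n}(x))
      = (m1/m2)**(-alpha) * (1 + o_p(1)),
    so alpha can be read off on the log scale.  In practice one would
    guard against a non-positive ratio in finite samples."""
    ratio = (L(m1, x) - L(m3, x)) / (L(m2, x) - L(m4, x))
    return -np.log(ratio) / np.log(m1 / m2)
```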

To show that \(\hat{m}^{(2)}_k=m'_{ opt }(1+o_p(1))\), we note first that, for bootstrap sample sizes \(M_1,M_2\) satisfying \(M_i=\Omega (n^{\beta /(\alpha +\beta )})\), we have, according to (A.0), that

$$\begin{aligned}&\mathcal {L}_{M_1,n}^*(\cdot )-\mathcal {L}_{M_2,n}^*(\cdot )=(M_1^{-\alpha }-M_2^{-\alpha }) \hat{A}_1(\cdot )\\&\quad +\, n^{-\beta }(M_1^{\beta }-M_2^{\beta })\hat{A}_2 (\cdot )+o_p(n^{-\alpha \beta /(\alpha +\beta )}). \end{aligned}$$

Setting \((M_1,M_2)\) to be \((m_1,m_2)\) and \((m_2,m_3)\) in the above expansion yields two equations from which we can obtain solutions, up to order \(o_p(1)\), for \(\hat{A}_1(\cdot )\) and \(\hat{A}_2(\cdot )\). The objective functions used for defining \(m_{ opt }'\) and \(\hat{m}^{(2)}_k\) are therefore asymptotically equivalent to first order, so that \(\hat{m}^{(2)}_k=m'_{ opt }(1+o_p(1))\). This completes our proof.
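For concreteness, the two equations used in the final step form a linear system in \((\hat{A}_1(\cdot ),\hat{A}_2(\cdot ))\), namely (up to the \(o_p\) terms, and with \(\alpha ,\beta \) or their estimates plugged in):

$$\begin{aligned} \begin{pmatrix} m_1^{-\alpha }-m_2^{-\alpha } & n^{-\beta }(m_1^{\beta }-m_2^{\beta })\\ m_2^{-\alpha }-m_3^{-\alpha } & n^{-\beta }(m_2^{\beta }-m_3^{\beta }) \end{pmatrix} \begin{pmatrix} \hat{A}_1(\cdot )\\ \hat{A}_2(\cdot ) \end{pmatrix} = \begin{pmatrix} \mathcal {L}_{m_1,n}^*(\cdot )-\mathcal {L}_{m_2,n}^*(\cdot )\\ \mathcal {L}_{m_2,n}^*(\cdot )-\mathcal {L}_{m_3,n}^*(\cdot ) \end{pmatrix}, \end{aligned}$$

whose coefficient matrix is nonsingular for distinct \(m_1,m_2,m_3\) of the stated order.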

1.3 Proof of (A.0) for Example 3.1

Define \(\mathcal{Z}_n=n^{1/2}\Sigma ^{-1/2}(\bar{X}-\theta _n)\) and \(\mathcal{V}_n=n^{1/2}(\hat{\Sigma }-\Sigma )\). Note that \(\mathcal{Z}_n\) converges weakly to a \(p\)-variate standard normal random vector \(\mathcal{Z}\). For any \(\psi \in {{\mathbb {R}}}^p\), define \(\mathcal{R}_n(\psi ) =\left( \Sigma +n^{-1/2}\mathcal{V}_n\right) ^{-1/2} \left( \psi + \Sigma ^{1/2}\mathcal{Z}_n\right) \), and \(\mathcal {J}_n(\cdot |\psi )\) to be the distribution function of

$$\begin{aligned}&\left\| \left( 1-a\left\| \mathcal{R}_n(\psi )\right\| ^{-2}\right) \mathbf{1}\{\left\| \mathcal{R}_n(\psi )\right\| ^{2}>k\}\mathcal{R}_n (\psi )\right. \\&\quad -\left. \left( \Sigma +n^{-1/2}\mathcal{V}_n\right) ^{-1/2}\psi \right\| ^2. \end{aligned}$$

Under suitable moment and Cramér-type conditions on \(F\), Hall (1992) derives Edgeworth expansions in a general smooth function model setting. The method can be applied to establish an Edgeworth expansion for the joint density of \((\mathcal{Z}_n,\mathcal{V}_n)\), which can then be integrated to provide an expansion for \(\mathcal {J}_n(\cdot |\psi )\) of the form

$$\begin{aligned} \mathcal {J}_n(x|\psi )=\mathcal {J}_\infty (x|\psi )+n^{-1/2}\mathcal {J}_{\infty ,1}(x| \psi )+o(n^{-1/2}), \end{aligned}$$
(10)

uniformly over \(x>0\) and \(\psi \) in an open neighbourhood of 0, where \(\mathcal {J}_{\infty }(\cdot |\psi )\) and \(\mathcal {J}_{\infty ,1}(\cdot |\psi )\) are functions depending smoothly on \(\psi \) and the moments of \(F\). Ahmed et al. (2007) show that when \(\psi \rightarrow 0\), \(\mathcal {J}_\infty (\cdot |\psi )\) depends on \(\psi \) through \(\psi ^{\mathrm{T}}\Sigma ^{-1}\psi \). Writing \(J_\infty (\cdot |\psi ^{\mathrm{T}}\Sigma ^{-1}\psi )\) for \(\mathcal {J}_\infty (\cdot |\psi )\), setting \(\psi =n^{1/2}\theta _n\rightarrow 0\) and Taylor expanding (10) about \(\psi =0\), we obtain an expansion for \(\mathcal {L}_n(c^2)\), given by

$$\begin{aligned} \mathcal {J}_n(c^2|n^{1/2}\theta _n)&= J_\infty (c^2|0)+\partial J_\infty (c^2|0) \left( n\,\theta _n^{\mathrm{T}}\Sigma ^{-1}\theta _n\right) \nonumber \\&\quad +\,n^{-1/2}\mathcal {J}_{\infty ,1}(c^2|0) + o\left( n^{-1/2}+n\Vert \theta _n\Vert ^2\right) \nonumber \\&\quad \longrightarrow J_\infty (c^2|0) \text{ as } n\rightarrow \infty , \end{aligned}$$
(11)

where \(\partial J_\infty (\cdot |0)=\left. (\partial /\partial \tau )J_\infty (\cdot |\tau )\right| _{\tau =0}\). It follows that \(c^2\) should be set at \(J_\infty ^{-1}(1-\kappa |0)+o_p(1)\) for \(D_k(c^2)\) to be asymptotically correct. The bootstrap distribution \(\mathcal {L}^*_{m,n}(\cdot )\) has an expansion given by the sample version of (10) at \(\psi =m^{1/2}\bar{X}\), that is

$$\begin{aligned} \mathcal {L}^*_{m,n}(x)&= \hat{\mathcal {J}}_\infty (x|m^{1/2}\bar{X})+m^{-1/2} \hat{\mathcal {J}}_{\infty ,1}(x|m^{1/2}\bar{X})\nonumber \\&\quad +\,o_p(m^{-1/2}), \end{aligned}$$
(12)

where \(\hat{\mathcal {J}}_\infty (\cdot |\cdot )\) and \(\hat{\mathcal {J}}_{\infty ,1}(\cdot |\cdot )\) are sample versions of \(\mathcal {J}_\infty (\cdot |\cdot )\) and \(\mathcal {J}_{\infty ,1}(\cdot |\cdot )\), respectively, obtained by replacing population with sample moments of \(F\) in the definitions of the latter. For the case \(m=n\), \(m^{1/2}\bar{X}=\Sigma ^{1/2}\mathcal{Z}_n+o_p(1)\), so that, by (12), \(\mathcal {L}^*_{n,n}(\cdot )\) converges in probability to a random distribution function \(\mathcal {J}_\infty (\cdot |\Sigma ^{1/2}\mathcal{Z})\). It follows that \(\mathcal {L}^{*-1}_{n,n}(1-\kappa )\) fails to converge in probability to the correct limit \(J_\infty ^{-1}(1-\kappa |0)\). For the case \(m\rightarrow \infty \) and \(m=o(n)\), we have \(m^{1/2}\bar{X}=m^{1/2}n^{-1/2}\left( n^{1/2}\theta _n+\Sigma ^{1/2}\mathcal{Z}_n\right) =o_p(1)\). As in (11), we can expand (12) to obtain

$$\begin{aligned} \mathcal {L}^*_{m,n}(x)&= J_\infty (x|0)+mn^{-1}\partial J_\infty (x|0)\mathcal{Z}_n^{\mathrm{T}}\mathcal{Z}_n\nonumber \\&\quad +\,m^{-1/2}\mathcal {J}_{\infty ,1}(x|0)+ o_p\left( m^{-1/2}+mn^{-1}\right) \!,\nonumber \\ \end{aligned}$$
(13)

which converges in probability to \(J_\infty (x|0)\), inversion of which yields the asymptotically correct limit. It is clear from (11) and (13) that (A.0) holds with \(\varepsilon _n(F)=\partial J_\infty (\cdot |0)\left( n\,\theta _n^{\mathrm{T}}\Sigma ^{-1}\theta _n\right) (1+o(1))\), \(\alpha =1/2\), \(\beta =1\) and \(\lambda =\infty \), so that \(\delta _{n,F}=|\varepsilon _n(F)|\).
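To see the dichotomy numerically, the following simulation sketch uses a simplified stand-in for the statistic of Example 3.1 (known \(\Sigma =I_p\), illustrative constants \(a\) and \(k\), and \(\theta _n=0\), the nonregular point); it is not the paper's Monte Carlo design. With \(\alpha =1/2\) and \(\beta =1\), the rate \(m\asymp n^{\beta /(\alpha +\beta )}=n^{2/3}\) from Proposition 1 motivates the choice of \(m\) below.

```python
import numpy as np

rng = np.random.default_rng(0)
p, a, k = 5, 3.0, 3.0    # dimension, shrinkage and threshold constants (illustrative)
n, B = 2000, 2000
theta = np.zeros(p)      # theta_n = 0: the nonregular point

def js_error(sample, m, centre):
    """Squared error of a thresholded James-Stein-type estimate of the
    scaled mean, with Sigma = I assumed known; a simplified version of
    the statistic whose distribution function is J_n in Example 3.1."""
    z = np.sqrt(m) * sample.mean(axis=0)
    shrunk = (1.0 - a / (z @ z)) * float(z @ z > k) * z
    return float(np.sum((shrunk - centre) ** 2))

x = rng.normal(theta, 1.0, size=(n, p))

def boot_dist(m):
    """m-out-of-n bootstrap replicates, centred at m^{1/2} * x_bar
    (the role played by psi = m^{1/2} X_bar in the proof)."""
    centre = np.sqrt(m) * x.mean(axis=0)
    return np.array([js_error(x[rng.integers(0, n, size=m)], m, centre)
                     for _ in range(B)])

q_nn = np.quantile(boot_dist(n), 0.95)                # n/n bootstrap: inconsistent
q_mn = np.quantile(boot_dist(int(n ** (2/3))), 0.95)  # m ~ n^{2/3}: consistent
print(q_nn, q_mn)
```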

1.4 Proof of (A.0) for Example 3.2

Define \(Z_n=n^{1/2}(b(0)-\xi _n), V_n=n^{1/2}\left( n^{-1}X^{\mathrm{T}}X-V_0\right) \) and \(w_n=n^{1/2}(s^2-\sigma ^2)\). For any \(\psi \in {{\mathbb {R}}}^p\), define \(\mathcal {H}_n(\cdot |\psi )\) to be the distribution function of

$$\begin{aligned}&\left( V_0+n^{-1/2}V_n+\frac{p(\sigma ^2+n^{-1/2}w_n)}{\Vert Z_n+ \psi \Vert ^2}I_p\right) ^{-1}\nonumber \\&\quad (V_0+n^{-1/2}V_n) (Z_n+\psi )-\psi , \end{aligned}$$
(14)

so that \(\mathcal {L}_n(\cdot )=\mathcal {H}_n(\cdot |n^{1/2}\xi _n)\). Edgeworth expansions developed by Lahiri (1992) for M-estimators can be adapted to show that, under regularity conditions on the joint distribution of \((X_1,Y_1)\), \(\mathcal {H}_n(\cdot |\psi )\) admits an Edgeworth-type expansion of the form

$$\begin{aligned} \mathcal {H}_n(x|\psi )=\mathcal {H}_\infty (x|\psi )\,{+}\,n^{-1/2}\mathcal {H}_{\infty , 1}(x|\psi )\,{+}\,o(n^{-1/2}), \end{aligned}$$
(15)

uniformly over \(x\in {{\mathbb {R}}}\) and \(\psi \) in an open neighbourhood of 0, where \(\mathcal {H}_\infty (x|\psi )\) and \(\mathcal {H}_{\infty ,1}(x|\psi )\) depend smoothly on \(\psi \) and the moments of \((X_1,Y_1)\). Letting \(\nabla \mathcal {H}_\infty (\cdot |0)=\left. (\partial /\partial \psi )\mathcal {H}_\infty (\cdot | \psi )\right| _{\psi =0}\), setting \(\psi =n^{1/2}\xi _n\) and Taylor expanding about \(\psi =0\), we have

$$\begin{aligned} \mathcal {H}_n(x|n^{1/2}\xi _n)&= \mathcal {H}_\infty (x|0)+n^{1/2}\nabla \mathcal {H}_\infty (x |0)^{\mathrm{T}}\xi _n\nonumber \\&\quad +\,n^{-1/2}\mathcal {H}_{\infty ,1}(x|0)+o(n^{-1/2}+n^{1/2}\Vert \xi _n\Vert )\nonumber \\&\quad \longrightarrow \mathcal {H}_\infty (x|0). \end{aligned}$$
(16)

The \(m\) out of \(n\) bootstrap analogues of \(Z_n, V_n\) and \(w_n\) are given respectively by \(Z^*_m=m^{1/2}(b^*_m(0)-b(0)), V^*_m=m^{1/2}\left( m^{-1}X^{*\text {T}} X^*-n^{-1}X^{\mathrm{T}}X\right) \) and \(w^*_m=m^{1/2}(s^{*2}_m-s^2)\). Drawing on the analogy between (14) and

$$\begin{aligned}&m^{1/2}\left( \hat{b}_m^*-b(0)\right) \\&\quad = \left( n^{-1}X^{\mathrm{T}}X+m^{-1/2}V^*_m+\frac{p(s^2+m^{-1/2} w^*_m)}{\Vert Z^*_m+m^{1/2}b(0)\Vert ^2}I_p\right) ^{-1}\\&\quad \times \left( n^{-1}X^{\mathrm{T}}X+m^{-1/2} V^*_m\right) \left( Z^*_m+m^{1/2}b(0)\right) \!-\!m^{1/2}b(0), \end{aligned}$$

we deduce an expansion analogous to (15) for \(\mathcal {L}^*_{m,n}\), given by

$$\begin{aligned} \mathcal {L}^*_{m,n}(x)&= \hat{\mathcal {H}}_\infty (x|m^{1/2}b(0))+m^{-1/2} \hat{\mathcal {H}}_{\infty ,1}(x|m^{1/2}b(0))\nonumber \\&\quad +\,o_p(m^{-1/2}), \end{aligned}$$
(17)

where \(\hat{\mathcal {H}}_\infty (\cdot |\cdot )\) and \(\hat{\mathcal {H}}_{\infty ,1}(\cdot |\cdot )\) are obtained from \(\mathcal {H}_\infty (\cdot |\cdot )\) and \(\mathcal {H}_{\infty ,1}(\cdot |\cdot )\), respectively, by substituting sample moments of \((X,Y)\) for their population moments in the definitions of the latter. If \(m=n\), we have \(n^{1/2}b(0)=Z_n+o(1)\) converging in distribution to \(Z\sim N(0,\sigma ^2V_0^{-1})\), so that, by (17), \(\mathcal {L}^*_{n,n}\) converges in probability to a random distribution function \(\mathcal {H}_\infty (\cdot |Z)\), which fails to estimate \(\mathcal {L}_n(\cdot )\) consistently. If \(m\rightarrow \infty \) and \(m=o(n)\), then \(m^{1/2}b(0)=m^{1/2}n^{-1/2}\left( n^{1/2}\xi _n+Z_n\right) =o_p(1)\), which yields for \(\mathcal {L}^*_{m,n}\) an expansion analogous to (16), given by

$$\begin{aligned} \mathcal {L}^*_{m,n}(x)&= \hat{\mathcal {H}}_\infty (x|0)+m^{1/2}n^{-1/2}\nabla \hat{\mathcal {H}}_\infty (x|0)^{\mathrm{T}}Z_n\nonumber \\&\quad +\,m^{-1/2}\hat{\mathcal {H}}_{\infty ,1}(x|0)\nonumber \\&\quad +\,o_p\left( m^{-1/2}+m^{1/2}n^{-1/2}\right) , \end{aligned}$$
(18)

which converges in probability to the correct limit \(\mathcal {H}_\infty (x|0)\). Note that the expansions (16) and (18) satisfy (A.0) with \(\varepsilon _n(F)=n^{1/2}\nabla \mathcal {H}_\infty (x|0)^{\mathrm{T}}\xi _n(1+o(1))\) and \(\alpha =\beta =\lambda =1/2\), so that \(\delta _{n,F}= \max \left\{ |\varepsilon _n(F)|,n^{-1/2}\right\} \).
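Combining these exponents with Proposition 1: since \(\alpha =\beta =1/2\) here, the stochastically optimal bootstrap sample size and the attained error for the ridge example are of orders

$$\begin{aligned} m'_{ opt }=\Omega _p\left( n^{\beta /(\alpha +\beta )}\right) =\Omega _p\left( n^{1/2}\right) ,\qquad \min _m\,\rho \left( \mathcal {L}_{m,n}^*-\mathcal {L}_n\right) =\Omega _p\left( n^{-\alpha \beta /(\alpha +\beta )}\right) =\Omega _p\left( n^{-1/4}\right) , \end{aligned}$$

provided \(\delta _{n,F}\) is of smaller order, as in part (i) of Proposition 1; contrast this with Example 3.1, where \(\alpha =1/2\) and \(\beta =1\) give the rates \(n^{2/3}\) and \(n^{-1/3}\).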

1.5 Proof of (A.0) for Example 3.3

Write, for brevity, \(\Delta g_{ij}(w_1,w_2)=g_i(w_1)-g_j(w_2)\) for any \(w_1,w_2\in {{\mathbb {R}}}^5\) and \(i,j=0,1,2\). Define, for \(c_1,c_2,c_3\in {{\mathbb {R}}}\),

$$\begin{aligned}&\mathcal {K}_n(c_1,c_2,c_3)\\&\quad = \mathbb {P}\left( n^{1/2}\Delta g_{11}(\bar{W}, \mu _W)\le c_1,\;|n^{1/2}\Delta g_{00}(\bar{W},\mu _W)+c_3|> \kappa \right) \nonumber \\&\quad +\,\mathbb {P}\left( n^{1/2}\Delta g_{22}(\bar{W},\mu _W)\le c_2,\;|n^{1/2}\Delta g_{00}(\bar{W},\mu _W)+c_3|\le \kappa \right) , \end{aligned}$$

so that \(\mathcal {L}_n(\cdot )=\mathcal {K}_n\left( \cdot \,,\,\cdot -n^{1/2}\beta _{0, n}\rho _W,\,n^{1/2}\beta _{0,n}\right) \), where \(\rho _W=\mathbb {E}[X_1]/\mathbb {E}[X^2_1]\). Edgeworth expansions under the smooth function model can be applied to the random vector \([g_0(\bar{W}),g_1(\bar{W}),g_2(\bar{W})]^{\mathrm{T}}\) to obtain

$$\begin{aligned} \mathcal {K}_n(c_1,c_2,c_3)&= \mathcal {K}_\infty (c_1,c_2,c_3)+n^{-1/2}\mathcal {K}_{\infty ,1} (c_1,c_2,c_3)\nonumber \\&\quad +\,o(n^{-1/2}), \end{aligned}$$
(19)

uniformly over \((c_1,c_2,c_3)\in {{\mathbb {R}}}^3\), where \(\mathcal {K}_\infty (\cdot )\) and \(\mathcal {K}_{\infty ,1}(\cdot )\) are smooth functions depending on the moments of \((X_1,Y_1)\). Setting \((c_1,c_2,c_3)=\left( t,\,t-n^{1/2} \beta _{0,n}\rho _W,\,n^{1/2}\beta _{0,n}\right) \) in (19) and noting that \(n^{1/2}\beta _{0,n}=o(1)\), we have

$$\begin{aligned} \mathcal {L}_n(t)&= \mathcal {K}_\infty (t,t,0)+n^{-1/2}\mathcal {K}_{\infty ,1}(t,t,0)\nonumber \\&\quad +\,n^{1/2} \beta _{0,n}\left\{ \partial _3\mathcal {K}_\infty (t,t,0)-\rho _W\partial _2 \mathcal {K}_\infty (t,t,0)\right\} \nonumber \\&\quad +\,o\left( n^{-1/2}+n^{1/2}|\beta _{0,n}|\right) \; \longrightarrow \;\mathcal {K}_\infty (t,t,0), \end{aligned}$$
(20)

where \(\partial _j\mathcal {K}_\infty \) denotes the partial derivative of \(\mathcal {K}_\infty \) with respect to its \(j\)th argument. Denote by \(\hat{\mathcal {K}}_{\infty }\) and \(\hat{\mathcal {K}}_{\infty ,1}\) the sample analogues of \(\mathcal {K}_\infty \) and \(\mathcal {K}_{\infty ,1}\), respectively. The \(m\) out of \(n\) bootstrap version of \(\mathcal {K}_n\) has an expansion analogous to (19), given by

$$\begin{aligned} \hat{\mathcal {K}}_m(c_1,c_2,c_3)&= \hat{\mathcal {K}}_\infty (c_1,c_2,c_3)+m^{-1/2} \hat{\mathcal {K}}_{\infty ,1}(c_1,c_2,c_3)\nonumber \\&\quad +\,o_p(m^{-1/2}), \end{aligned}$$
(21)

which converges in probability to \(\mathcal {K}_\infty (c_1,c_2,c_3)\). Note that

$$\begin{aligned} \mathcal {L}^*_{m,n}(t)&= \mathbf{1}\left\{ n^{1/2}|g_{0}(\bar{W})|> \kappa \right\} \nonumber \\&\quad \times \,\hat{\mathcal {K}}_m\left( t,t+m^{1/2}\Delta g_{12}(\bar{W},\bar{W}),m^{1/2}g_0 (\bar{W})\right) \nonumber \\&\quad +\,\mathbf{1}\left\{ n^{1/2}|g_{0}(\bar{W})| \le \kappa \right\} \nonumber \\&\quad \times \,\hat{\mathcal {K}}_m\left( t-m^{1/2}\Delta g_{12}(\bar{W},\bar{W}),t,m^{1/2}g_0(\bar{W})\right) .\nonumber \\ \end{aligned}$$
(22)

By the Central Limit Theorem, \(n^{1/2}[\,\Delta g_{00}(\bar{W},\mu _W),\,\Delta g_{11}(\bar{W},\mu _W),\,\Delta g_{22}(\bar{W},\mu _W)\,]^{\mathrm{T}}\) converges in distribution to a trivariate normal random vector \(\mathcal {W}=[\mathcal {W}_0,\mathcal {W}_1,\mathcal {W}_2]^{\mathrm{T}}\). For the conventional bootstrap with \(m=n\), we have that

$$\begin{aligned}&n^{1/2}\left[ \Delta g_{12}(\bar{W},\bar{W}),g_0(\bar{W})\right] \\&\quad = n^{1/2}\left[ \Delta g_{11}(\bar{W},\mu _W)-\Delta g_{22}(\bar{W}, \mu _W),\Delta g_{00}(\bar{W},\mu _W)\right] \\&\qquad -n^{1/2}\beta _{0,n}\left[ \rho _W,-1\right] \end{aligned}$$

converges in distribution to \(\left[ \mathcal {W}_1- \mathcal {W}_2, \mathcal {W}_0\right] \). Thus the bootstrap distribution function (22) converges in probability to a random function

$$\begin{aligned}&\mathbf{1}\left\{ \mathcal {W}_0>\kappa \right\} \mathcal {K}_\infty \left( t,\,t +\mathcal {W}_1-\mathcal {W}_2, \mathcal {W}_0\right) \\&\quad + \mathbf{1}\left\{ \mathcal {W}_0\le \kappa \right\} \mathcal {K}_\infty \left( t- \mathcal {W}_1+\mathcal {W}_2,t, \mathcal {W}_0\right) , \end{aligned}$$

which fails to capture the correct limit \(\mathcal {K}_\infty (t,t,0)\). For \(m=o(n)\) and \(m\rightarrow \infty \), both \(m^{1/2}\Delta g_{12}(\bar{W},\bar{W})\) and \(m^{1/2}g_0(\bar{W})\) are of order \(\Omega _p(m^{1/2}n^{-1/2})=o_p(1)\). It follows, by (21) and Taylor expansion, that (22) can be expanded as

$$\begin{aligned}&\hat{\mathcal {K}}_\infty (t,t,0)+m^{-1/2}\hat{\mathcal {K}}_{\infty ,1}(t,t,0)+(m/n)^{1/ 2}\nonumber \\&\quad \left[ n^{1/2}g_0(\bar{W})\,\partial _3\hat{\mathcal {K}}_\infty (t,t,0) \right] \nonumber \\&\quad +\,(m/n)^{1/2}\left[ \mathbf{1}\{n^{1/2}|g_0(\bar{W})|>\kappa \} \partial _2\hat{\mathcal {K}}_\infty (t,t,0)\right. \nonumber \\&\quad \left. -\,\,\mathbf{1}\{n^{1/2}|g_0(\bar{W})|\le \kappa \}\partial _1\hat{\mathcal {K}}_\infty (t,t,0)\right] \nonumber \\&\quad \times \,n^{1/2}\Delta g_{12}(\bar{W},\bar{W})+o_p\left( m^{-1/2}+ m^{1/2}n^{-1/2}\right) , \end{aligned}$$
(23)

which converges to the correct limit \(\mathcal {K}_\infty (t,t,0)\). Note that (A.0) is satisfied by the expansions (20) and (23), which have the same forms as (16) and (18), respectively, established in Sect. 1.4.
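Since the exponents match those of Example 3.2 (\(\alpha =\beta =1/2\)), the same orders apply to the post-model-selection example:

$$\begin{aligned} m'_{ opt }=\Omega _p\left( n^{1/2}\right) ,\qquad \min _m\,\rho \left( \mathcal {L}_{m,n}^*-\mathcal {L}_n\right) =\Omega _p\left( n^{-1/4}\right) . \end{aligned}$$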

About this article

Cite this article

Wei, B., Lee, S.M.S. & Wu, X. Stochastically optimal bootstrap sample size for shrinkage-type statistics. Stat Comput 26, 249–262 (2016). https://doi.org/10.1007/s11222-014-9493-x