Abstract
In this paper we propose a new method for estimating the parameters of a single-index model under censoring, based on the Beran estimator of the conditional distribution function. This likelihood-based method is also a simple and useful tool for bandwidth selection. Additionally, we perform an extensive simulation study comparing the new Beran-based approach with an existing method based on Kaplan–Meier integrals. Finally, we apply both methods to a primary biliary cirrhosis data set and propose a bootstrap test for the parameters.
References
Beran R (1981) Nonparametric regression with randomly censored survival data. Technical Report, University of California, Berkeley
Bouaziz O, Lopez O (2010) Conditional density estimation in a censored single-index model. Bernoulli 16:514–542
Carroll RJ, Fan J, Gijbels I, Wand MP (1997) Generalized partially linear single-index models. J Am Stat Assoc 92:477–489
Delecroix M, Hristache M, Patilea V (2006) On semiparametric M-estimation in single-index regression. J Stat Plan Inference 136:730–769
Escanciano JC, Song K (2010) Testing single-index restrictions with a focus on average derivatives. J Econ 156:377–391
Fleming T, Harrington D (1991) Counting processes and survival analysis. Wiley, New York
González-Manteiga W, Cadarso-Suárez C (1994) Asymptotic properties of a generalized Kaplan–Meier estimator with some applications. J Nonparametr Stat 4:65–78
Härdle W, Hall P, Ichimura H (1993) Optimal smoothing in single-index models. Ann Stat 21:157–178
Härdle W, Mammen E, Proença I (2001) A bootstrap test for single index models. Statistics 35:427–451
Hristache M, Juditsky A, Spokoiny V (2001) Direct estimation of the index coefficient in a single-index model. Ann Stat 29:595–623
Huang ZS (2012) Corrected empirical likelihood inference for right-censored partially linear single-index model. J Multivar Anal 105:276–284
Huang ZS, Lin B, Feng F, Pang Z (2013) Efficient penalized estimating method in the partially varying-coefficient single-index model. J Multivar Anal 114:189–200
Huang ZS, Zhang R (2011) Efficient empirical-likelihood-based inferences for the single-index model. J Multivar Anal 102:937–947
Ichimura H (1993) Semiparametric least squares (SLS) and weighted SLS estimation of single-index models. J Econ 58:71–120
Iglesias-Pérez MC, González-Manteiga W (2003) Bootstrap for the conditional distribution function with truncated and censored data. Ann Inst Stat Math 55:331–357
Strzalkowska-Kominiak E, Cao R (2013) Maximum likelihood estimation for conditional distribution single-index models under censoring. J Multivar Anal 114:74–98
Stute W, Zhu LX (2005) Nonparametric checks for single-index models. Ann Stat 33:1048–1083
Wang JL, Xue L, Zhu L, Chong YS (2010) Estimation for a partial-linear single-index model. Ann Stat 38:246–274
Xia Y, Härdle W (2006) Semi-parametric estimation of partially linear single-index models. J Multivar Anal 97:1162–1184
Zhang R, Huang Z, Lv Y (2010) Statistical inference for the index parameter in single-index models. J Multivar Anal 101:1026–1041
Zhu LX, Xue LG (2006) Empirical likelihood confidence regions in a partially linear single-index model. J R Stat Soc Ser B 68:549–570
Acknowledgments
The authors acknowledge financial support from Ministerio de Economía y Competitividad Grant MTM2011-22392 (EU ERDF support included). Additionally, Ewa Strzalkowska-Kominiak acknowledges financial support from a Juan de la Cierva scholarship and ECO2011-25706 from the Spanish Ministerio de Economía y Competitividad.
Appendix: Proof of Theorem 1
Lemma 1
Let \(\tilde{F}_{n\theta }(y|\theta '\mathbf {x})\) be the Beran estimator. Under B1–B5, we have
a) \( \nabla _{\theta }\log (1-\tilde{F}_{n\theta }(y|\theta '\mathbf {x}))\mathop {\rightarrow }\limits ^{n\rightarrow \infty } \nabla _{\theta }\log (1- F_{\theta }(y|\theta '\mathbf {x}))\) in probability and consequently
b) \(\nabla _{\theta }\tilde{F}_{n\theta }(y|\theta '\mathbf {x})\mathop {\rightarrow }\limits ^{n\rightarrow \infty } \nabla _{\theta }F_{\theta }(y|\theta '\mathbf {x})\) in probability.
Proof
Recall that
where
Then
Hence
Moreover,
Let
and
Hence if \(H\) is continuous, it is easy to show that
Finally, using (vi) in González-Manteiga and Cadarso-Suárez (1994), it is obvious that
As to b), the chain rule gives \( \nabla _{\theta }\tilde{F}_{n\theta }(y|\theta '\mathbf {x} )=-\nabla _{\theta }\log (1-\tilde{F}_{n\theta }(y|\theta '\mathbf {x} ))(1-\tilde{F}_{n\theta }(y|\theta '\mathbf {x} ))\). Since \(\tilde{F}_{n\theta }(y|\theta '\mathbf {x} )\rightarrow F_{\theta }(y|\theta '\mathbf {x} )\), both factors on the right-hand side converge, the first one by a), and the proof is completed.
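For readers who want to experiment with the object studied in Lemma 1, the following is a minimal Python sketch of a Beran-type estimator of the conditional distribution function given the index \(\theta '\mathbf {x}\). It is not the authors' implementation. The function name `beran_cdf`, the Epanechnikov kernel, the single bandwidth `h` and the no-ties assumption are all illustrative choices that do not come from the paper; only the product-limit form with kernel weights summing to one follows the standard definition of the Beran estimator.

```python
import numpy as np

def beran_cdf(y, index_value, theta, X, Z, delta, h):
    """Beran-type estimate of F(y | theta'x = index_value), right-censored data.

    X: (n, d) covariates; Z: (n,) observed times min(Y, C);
    delta: (n,) 1 if uncensored, 0 if censored; h: bandwidth (assumed > 0).
    Assumes there are no ties among the observed times Z.
    """
    X = np.asarray(X, dtype=float)
    Z = np.asarray(Z, dtype=float)
    delta = np.asarray(delta, dtype=float)
    u = (X @ np.asarray(theta, dtype=float) - index_value) / h
    # Epanechnikov kernel: an illustrative choice, not taken from the paper.
    k = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)
    if k.sum() <= 0:
        raise ValueError("no observations within the bandwidth window")
    B = k / k.sum()  # Nadaraya-Watson weights, summing to one

    order = np.argsort(Z, kind="stable")
    Z, delta, B = Z[order], delta[order], B[order]
    # at_risk[i] = total weight of observations with Z_j >= Z_i
    at_risk = np.cumsum(B[::-1])[::-1]

    survival = 1.0
    for z_i, d_i, b_i, r_i in zip(Z, delta, B, at_risk):
        if z_i > y:
            break
        # Only uncensored observations with positive weight contribute a factor.
        if d_i == 1 and r_i > 0:
            survival *= 1.0 - b_i / r_i
    return 1.0 - survival
```

With a very large bandwidth all weights become equal and the sketch reduces to the Kaplan–Meier estimator; if, in addition, no observation is censored, it reduces to the empirical distribution function. Both facts give quick sanity checks.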
Lemma 2
Let \(\tilde{f}_{\theta }(y|\theta '\mathbf {x} )\) be the estimated density and \(\tilde{F}_{\theta }^S(y|\theta '\mathbf {x} )\) the smoothed distribution function estimator defined in (2) and (3). Then, under B1–B5 and when \(n\rightarrow \infty \)
a) \( \nabla _{\theta }\tilde{f}_{\theta }(y|\theta '\mathbf {x} )_{\theta =\theta _0}\rightarrow \nabla _{\theta }f_{\theta }(y|\theta '\mathbf {x} )_{\theta =\theta _0}\).
b) \( \nabla _{\theta }(1-\tilde{F}_{\theta }^S(y|\theta '\mathbf {x} ))_{\theta =\theta _0}\rightarrow \nabla _{\theta }(1-F_{\theta }(y|\theta '\mathbf {x} ))_{\theta =\theta _0}\).
Proof
Recall,
Since \(\sum _{i=1}^n B_{in}(\theta '\mathbf {x} )=1\), we have
where
Hence
Furthermore,
where
and
As in the proof of Lemma 1, we have
Hence, it is easy to show that
where
and
As to \(A_{1n}(y,\theta '\mathbf {x} )\), it is a sum of iid random variables, whose variance goes to zero when \(n\rightarrow \infty \) and \(n h_1^3 h_2\rightarrow \infty \). Moreover, the expectation of \(A_{1n}(y,\theta '\mathbf {x} )\) equals
Setting \(\theta =\theta _0\), under B2 and B3 and using a Taylor expansion, we obtain
Furthermore, using Lemmas 4 and 5 of Strzalkowska-Kominiak and Cao (2013), we have
and
Recall that the last equation holds even if the conditional distribution function of \(C\) given \(\theta _0'X\), \(G_{\theta _0}(y|\theta _0'\mathbf {x} )\), does not follow the single-index model assumption.
Finally, we obtain
Similarly, we may show that
Hence
As to \(B_n(y,\theta '\mathbf {x} )\), by a Taylor expansion and using Lemma 1 for the Beran estimator \(G_{n\theta }\), we obtain
This completes the proof of a). Finally, since
the proof of b) is similar.
Lemma 3
Let
be the theoretical log-likelihood function and \(\theta _0\) the true parameter. Then
when \(n\) goes to infinity.
Proof
We have
Lemma 2 completes the proof.
Similarly, we can show that
Lemma 4
Under conditions B1–B5,
where \(\tilde{l}_n^{[2]}(\theta )\) and \(l^{[2]}(\theta )\) denote the Hessian matrices of \(\tilde{l}_n(\theta )\) and \(E(l_n^{[2]}(\theta ))\), respectively.
Proof of Theorem 1
Using Theorem 1 in Strzalkowska-Kominiak and Cao (2013), it is easy to prove that, under B1,
Hence using a Taylor expansion, we have
where \(\hat{\theta }_n^*\) is between \(\hat{\theta }_n\) and \(\theta _0\). Furthermore,
Finally, using Lemmas 3 and 4 and since \(\nabla _{\theta } l_n(\theta )\rightarrow E(\nabla _{\theta } l_n(\theta )_{\theta =\theta _0})\) in probability, the proof is completed.
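As a reading aid, the Taylor-expansion step has the usual form for maximum-likelihood-type estimators. The display below is a generic sketch of that pattern, not the authors' own formula. It assumes that \(\hat{\theta }_n\) is an interior maximizer of \(\tilde{l}_n\), so that the gradient vanishes there, and that the Hessian \(\tilde{l}_n^{[2]}\) is invertible at the intermediate point \(\hat{\theta }_n^*\) (understood row by row, as is customary for vector-valued expansions):

\[ 0=\nabla _{\theta }\tilde{l}_n(\hat{\theta }_n)=\nabla _{\theta }\tilde{l}_n(\theta _0)+\tilde{l}_n^{[2]}(\hat{\theta }_n^*)\,(\hat{\theta }_n-\theta _0), \qquad \text{so that}\qquad \hat{\theta }_n-\theta _0=-\bigl[\tilde{l}_n^{[2]}(\hat{\theta }_n^*)\bigr]^{-1}\nabla _{\theta }\tilde{l}_n(\theta _0). \]

Under these assumptions, the behaviour of \(\hat{\theta }_n-\theta _0\) is governed by the gradient at \(\theta _0\) and by the Hessian near \(\theta _0\), which is where results such as Lemmas 3 and 4 would enter.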
Cite this article
Strzalkowska-Kominiak, E., Cao, R. Beran-based approach for single-index models under censoring. Comput Stat 29, 1243–1261 (2014). https://doi.org/10.1007/s00180-014-0489-y
DOI: https://doi.org/10.1007/s00180-014-0489-y