An empirical study of a test for polynomial relationships in randomly right censored regression models
Introduction
In recent years, there has been a great deal of interest in the regression analysis of randomly right censored data, particularly in survival analysis for clinical trials, where patients often survive beyond the end of the trial period or are lost to follow-up. A general way to model such a situation is to introduce a censoring variable in addition to the response and explanatory variables. More precisely, denote by $Y_1,\ldots,Y_n$ an independent sample from an unknown lifetime distribution function $F$ and by $X_1,\ldots,X_n$ the associated observable covariates. Let $C_1,\ldots,C_n$ be an independent sample from a so-called censoring distribution function $G$. Hence, we observe $(X_i, Z_i, \delta_i)$, $i=1,\ldots,n$, where $Z_i=\min(Y_i, C_i)$ and $\delta_i=I(Y_i\le C_i)$, with $I(\cdot)$ being the indicator function. It is further assumed that the distribution function satisfies , where and .
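The observation scheme described above can be illustrated with a small simulation. This is only a sketch under assumed settings: the linear form of the latent response, the uniform noise, the exponential censoring distribution, and the function name `simulate_right_censored` are all hypothetical choices made for illustration and do not come from the paper.

```python
import numpy as np

def simulate_right_censored(n, rng):
    """Draw one sample under an illustrative random right censoring scheme.

    All distributional choices here are assumptions made for the example.
    """
    x = rng.uniform(0.0, 1.0, size=n)                   # observable covariates
    y = 1.0 + 2.0 * x + rng.uniform(-0.5, 0.5, size=n)  # latent lifetimes (positive)
    c = rng.exponential(scale=6.0, size=n)              # censoring times, independent of (x, y)
    z = np.minimum(y, c)                                # observed response: min(lifetime, censoring time)
    delta = (y <= c).astype(int)                        # 1 = lifetime observed, 0 = censored
    return x, y, c, z, delta

rng = np.random.default_rng(0)
x, y, c, z, delta = simulate_right_censored(500, rng)
print("proportion censored:", 1.0 - delta.mean())
```

In practice only `x`, `z`, and `delta` would be available to the analyst; `y` and `c` are returned here solely so that the construction can be checked.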
Suppose that $(X_i, Y_i)$ satisfy $Y_i = m(X_i) + \varepsilon_i$, $i=1,\ldots,n$, where $\varepsilon_i$ are error terms which are generally assumed to be independent and identically distributed random variables with mean zero and common variance $\sigma^2$, and $m(\cdot)$ is an unknown regression function. The main problem in right censored regression analysis is, as usual, to specify $m(\cdot)$ based on the observed censored data. Although some nonparametric smoothing techniques have been discussed for fitting the model (see, for example, Fan and Gijbels, 1994, Wang and Zheng, 1997, Wang and Li, 2002), the parametric regression model, in which the form of $m(\cdot)$ is known except for some parameters, is still one of the most commonly used models for analyzing randomly right censored data (see Miller, 1976, Buckley and James, 1979, Koul et al., 1981, Leurgans, 1987, Stute, 1993, He and Huang, 2003, Li and Wang, 2003) because of its simplicity and wide applicability.
However, as with complete data analyzed through a parametric regression model, if the assumed regression relationship deviates seriously from the real structure of the data, the conclusions will be misleading. Therefore, the development of statistical tests for parametric regression relationships is also an important issue in randomly right censored regression analysis. To the best of our knowledge, Kim (1993) constructed a generalized Pearson statistic to handle this problem and studied the large sample behavior of the test statistic. Nikabadze and Stute (1997) developed a method that transforms the general model check into another one for which asymptotically distribution-free full model checks are available. Stute et al. (2000) extended the test based on the empirical process of the regressors marked by the residuals in general regression to the case of right censored regression, and provided the asymptotic distribution of the underlying marked empirical process.
Motivated by the frequent use of nonparametric regression models for checking parametric regression relationships in the case of complete data (for example, see Azzalini and Bowman, 1993, Härdle and Mammen, 1993, Jayasuriya, 1996, Fan and Gijbels, 1996, Fan et al., 2001, Mei et al., 2003), we propose in this paper a relatively simple test for a polynomial regression model with randomly right censored data. With the properly transformed data, a test statistic is constructed by comparing the residual sums of squares obtained by fitting, respectively, a polynomial regression model and a nonparametric model. Two bootstrap procedures, namely the residual-based bootstrap and the naive bootstrap, are suggested to derive the $p$-value of the test. Simulation results demonstrate that the residual-based bootstrap approximates the null distribution of the test statistic more satisfactorily than the naive bootstrap, and that the test with the residual-based bootstrap is quite powerful in identifying polynomial relationships in randomly right censored regression. Although this paper is only an empirical study and the theoretical proof of the validity of the bootstrap approximations remains to be investigated, the simulation results suggest that the proposed test may be of some practical use.
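The idea of comparing residual sums of squares and calibrating the comparison by a residual-based bootstrap can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the paper: the statistic is taken to be the ratio $(\mathrm{RSS}_0-\mathrm{RSS}_1)/\mathrm{RSS}_1$, the nonparametric fit is a local linear smoother with a Gaussian kernel and a fixed bandwidth `h`, the bootstrap resamples centered residuals from the nonparametric fit, and the responses are complete (uncensored) values standing in for the transformed data. All function names are hypothetical.

```python
import numpy as np

def poly_fit(x, y, degree):
    """Fitted values of a least-squares polynomial of the given degree."""
    return np.polyval(np.polyfit(x, y, degree), x)

def local_linear_fit(x, y, h):
    """Fitted values of a local linear smoother (Gaussian kernel, bandwidth h)."""
    d = x[None, :] - x[:, None]                 # d[i, j] = x_j - x_i
    w = np.exp(-0.5 * (d / h) ** 2)             # kernel weights around each x_i
    s0 = w.sum(axis=1)
    s1 = (w * d).sum(axis=1)
    s2 = (w * d ** 2).sum(axis=1)
    num = (w * (s2[:, None] - d * s1[:, None]) * y[None, :]).sum(axis=1)
    return num / (s0 * s2 - s1 ** 2)

def rss_ratio(x, y, degree, h):
    """Assumed statistic: relative drop in RSS from polynomial to smoother."""
    rss0 = np.sum((y - poly_fit(x, y, degree)) ** 2)
    rss1 = np.sum((y - local_linear_fit(x, y, h)) ** 2)
    return (rss0 - rss1) / rss1

def bootstrap_p_value(x, y, degree, h, n_boot, rng):
    """Residual-based bootstrap approximation to the p-value (a sketch)."""
    t_obs = rss_ratio(x, y, degree, h)
    null_fit = poly_fit(x, y, degree)           # fitted values under the null model
    resid = y - local_linear_fit(x, y, h)
    resid = resid - resid.mean()                # centered residuals
    t_star = np.empty(n_boot)
    for b in range(n_boot):
        y_star = null_fit + rng.choice(resid, size=len(y), replace=True)
        t_star[b] = rss_ratio(x, y_star, degree, h)
    return t_obs, float(np.mean(t_star >= t_obs))

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 1.0, 100))
y_null = 1.0 + 2.0 * x + rng.normal(0.0, 0.3, 100)               # linear truth
y_alt = 2.0 * np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.3, 100)  # nonlinear truth
t_null, p_null = bootstrap_p_value(x, y_null, 1, 0.1, 200, rng)
t_alt, p_alt = bootstrap_p_value(x, y_alt, 1, 0.1, 200, rng)
print("linear truth:    T =", t_null, " p =", p_null)
print("nonlinear truth: T =", t_alt, " p =", p_alt)
```

A naive bootstrap would instead resample the observed pairs with replacement; it is omitted here to keep the sketch short.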
The remainder of this paper is organized as follows. In Section 2, a test statistic is constructed from the viewpoint of analysis of variance to check a polynomial relationship for a right censored data set with the local polynomial smoothing technique. Section 3 presents two bootstrap procedures to derive the $p$-value of the test. Simulations are conducted in Section 4 to empirically assess the performance of the test, and the Stanford heart transplant data are further analyzed in Section 5.
Construction of test statistic
For simplicity and notational convenience, we will restrict the discussion to the case of a univariate explanatory variable. When we have right censored data instead of complete data, the commonly used method to fit a regression relationship between the response variable and the explanatory variable is to first transform the incomplete data in an appropriate way. In this respect, Buckley and James (1979) as well as Koul et al. (1981) have proposed some
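As an illustration of what such a data transformation can look like, the sketch below implements the transformation commonly attributed to Koul et al. (1981), which replaces each observed response by $\delta_i Z_i / (1-\hat G(Z_i-))$, where $Z_i$ is the observed response, $\delta_i$ is the censoring indicator (1 for an uncensored lifetime), and $1-\hat G$ is the Kaplan-Meier estimate of the censoring survival function. Whether this is the transformation adopted in the paper is not stated in the text above, so it should be read as one example only. The use of the left limit, the no-ties assumption, the simulated data, and all function names are assumptions made for this sketch.

```python
import numpy as np

def km_censoring_survival(z, delta):
    """Kaplan-Meier estimate of P(C > t), evaluated just before each z_i.

    Censored observations (delta == 0) are treated as the events.
    Assumes no ties among the z values.
    """
    n = len(z)
    order = np.argsort(z, kind="stable")
    d_sorted = delta[order]
    at_risk = n - np.arange(n)                              # n, n-1, ..., 1
    factors = np.where(d_sorted == 0, (at_risk - 1) / at_risk, 1.0)
    surv_after = np.cumprod(factors)                        # value just after each sorted z
    surv_before = np.concatenate(([1.0], surv_after[:-1]))  # left limit at each sorted z
    out = np.empty(n)
    out[order] = surv_before
    return out

def ksv_transform(z, delta):
    """Synthetic responses: delta_i * z_i / (estimated P(C >= z_i))."""
    return delta * z / km_censoring_survival(z, delta)

# Illustrative data (all distributional choices are assumptions)
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(0.0, 1.0, n)
y = 1.0 + 2.0 * x + rng.uniform(-0.5, 0.5, n)   # latent lifetimes, mean 2
c = rng.exponential(6.0, n)                      # independent censoring times
z = np.minimum(y, c)
delta = (y <= c).astype(int)

y_star = ksv_transform(z, delta)
print("mean of latent lifetimes:   ", y.mean())
print("mean of transformed values: ", y_star.mean())
```

Censored observations are mapped to zero and uncensored ones are inflated, so that the transformed responses have approximately the same conditional mean as the latent lifetimes when censoring is independent of the lifetimes and covariates; the price is a larger variance.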
Calculation of the $p$-value
In order to calculate the $p$-value in (12), we should first obtain the null distribution of the test statistic. We may surmise in our case that the test statistic is asymptotically distributed as a $\chi^2$ distribution, based on the fact that its form is similar to that of the generalized likelihood ratio test statistic proposed by Fan et al. (2001) in the nonparametric regression setting where the observed data are complete. Note that the asymptotic null distribution of their test statistic was proved to be with
Simulation studies
Since the validity of the bootstrap procedures for approximating the null distribution of the test statistic remains to be investigated, simulations are conducted in this section to assess the performance of the proposed test method.
Analysis of the Stanford heart transplant data
In this section, we apply the proposed test to the Stanford heart transplant data.
The Stanford heart transplant program began in October 1967. By February 1980, 184 patients had received heart transplants and a few of them had multiple transplants. The final data contain the censored or uncensored survival times of these patients in February 1980 and their ages at the time of their first transplant. The original data can be found in Miller and Halpern (1982). This data set has been widely
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Nos. 10531030 and 60675013). The authors would like to thank the associate editor and two anonymous referees for their invaluable suggestions, which led to a substantial improvement of this paper.
References (37)
- et al. Goodness-of-fit tests for the Cox model via bootstrap method. J. Statist. Plann. Inference (1995)
- Goodness-of-fit tests for semiparametric biased sampling models. J. Statist. Plann. Inference (2004)
- Consistent estimation under random censorship when covariates are present. J. Multivariate Anal. (1993)
- et al. NN goodness-of-fit test for linear models. J. Statist. Plann. Inference (1996)
- et al. Empirical likelihood semiparametric regression analysis under random censorship. J. Multivariate Anal. (2002)
- et al. On the use of nonparametric regression for checking linear relationships. J. Roy. Statist. Soc. B (1993)
- et al. Linear regression with censored data. Biometrika (1979)
- et al. Functional-coefficient regression models for nonlinear time series. J. Amer. Statist. Assoc. (2000)
- Regression models and life tables. J. Roy. Statist. Soc. B (1972)
- et al. Goodness of fit tests in random coefficient regression models. Ann. Inst. Statist. Math. (1999)