Case influence diagnostics in the Kaplan-Meier estimator and the log-rank test

Computational Statistics

Abstract

One or a few observations can be highly influential on the Kaplan-Meier estimator, and consequently on the log-rank test statistic used to compare two survival functions. In this paper we derive case influence diagnostics for the Kaplan-Meier estimator and the log-rank test. We note that diagnostics in this context are quite different from those in the regression context, where observations are usually assumed to be independent. Simulation studies provide guidelines for identifying influential observations that deserve special attention. Illustrative examples are also given.




Additional information

The first author was supported by the Korea Research Foundation Grant (KRF-2002-041-C00049), and he is a member of the Research Institute of Computer, Information and Communication at Pusan National University.

Appendix: Proof of Eq. (2.5)

For $X_i \le t$,

$$\log \hat{S}(t)=\sum_{j=1}^{i-1} \delta_{j} \log (\frac{m-j}{m-j+1})+\sum_{j=i}^{k} \delta_{j} \log (\frac{m-j}{m-j+1})$$

and

$$\begin{aligned} \log \hat{S}_{-i}(t) &=\sum_{j=1}^{k-1} \delta_{j(i)}^{*} \log (\frac{m-j-1}{m-j}) \\ &=\sum_{j=1}^{i-1} \delta_{j} \log (\frac{m-j-1}{m-j})+\sum_{j=i}^{k-1} \delta_{j(i)}^{*} \log (\frac{m-j-1}{m-j}) \\ &=\sum_{j=1}^{i-1} \delta_{j} \log (\frac{m-j-1}{m-j})+\sum_{j=i+1}^{k} \delta_{j} \log (\frac{m-j}{m-j+1}) \end{aligned}$$
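As a sketch of the intermediate step, subtracting the second display from the first cancels the sums over $j=i+1,\ldots,k$ and leaves only the terms for $j \le i$. Eq. (2.5) itself is not reproduced in this excerpt, so the form below is offered only as the direct difference of the two displays above, not as the paper's final statement:

$$\log \hat{S}(t)-\log \hat{S}_{-i}(t)=\sum_{j=1}^{i-1} \delta_{j} \log \frac{(m-j)^{2}}{(m-j+1)(m-j-1)}+\delta_{i} \log \left(\frac{m-i}{m-i+1}\right)$$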

Also, for $X_i > t$,

$$\log \hat{S}(t)-\log \hat{S}_{-i}(t)=\sum_{j=1}^{k} \delta_{j} \log (\frac{m-j}{m-j+1})-\sum_{j=1}^{k} \delta_{j} \log (\frac{m-j-1}{m-j})$$

which completes the proof.
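The two cases above can be checked numerically. The following Python sketch is illustrative only: it assumes that m is the sample size, that the observations are ordered with distinct times X_1 < ... < X_m, that δ_j is 1 for an event and 0 for a censored time, and that k is the number of observations with X_j ≤ t. None of these definitions is spelled out in this excerpt. The function names and the data are hypothetical, and t is chosen so that at least two observations exceed it, which keeps every logarithm finite.

```python
import math

def log_km(times, events, t):
    """Log of the Kaplan-Meier estimate at t (assumes distinct times)."""
    m = len(times)
    order = sorted(range(m), key=lambda j: times[j])
    total = 0.0
    for rank, j in enumerate(order, start=1):
        if times[j] > t:
            break
        if events[j]:
            total += math.log((m - rank) / (m - rank + 1))
    return total

def influence_brute(times, events, t, i):
    """log S(t) - log S_{-i}(t), refitting without case i (0-based index)."""
    reduced_times = times[:i] + times[i + 1:]
    reduced_events = events[:i] + events[i + 1:]
    return log_km(times, events, t) - log_km(reduced_times, reduced_events, t)

def influence_closed(times, events, t, i):
    """Same difference from the appendix sums; times must be sorted ascending."""
    m = len(times)
    k = sum(1 for x in times if x <= t)
    r = i + 1                               # 1-based rank of the deleted case
    upper = r - 1 if times[i] <= t else k   # last index whose risk set shrinks
    total = 0.0
    for j in range(1, upper + 1):
        if events[j - 1]:
            total += math.log((m - j) / (m - j + 1))
            total -= math.log((m - j - 1) / (m - j))
    if times[i] <= t and events[i]:
        total += math.log((m - r) / (m - r + 1))  # the deleted case's own term
    return total

# Hypothetical data: sorted distinct times, 1 = event, 0 = censored.
times = [1.0, 2.5, 3.0, 4.2, 5.1, 6.7, 8.0]
events = [1, 0, 1, 1, 0, 1, 1]
t = 4.5

pairs = [(influence_brute(times, events, t, i),
          influence_closed(times, events, t, i))
         for i in range(len(times))]
for i, (brute, closed) in enumerate(pairs):
    print(i, round(brute, 6), round(closed, 6))
```

Refitting without each case and evaluating the sums directly should agree to rounding error. With these data, the first case (an event before t) has a negative difference, so deleting it raises the estimate at t.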

About this article

Cite this article

Kim, C., Bae, W. Case influence diagnostics in the Kaplan-Meier estimator and the log-rank test. Computational Statistics 20, 521–534 (2005). https://doi.org/10.1007/BF02741312

  • DOI: https://doi.org/10.1007/BF02741312
