Elsevier

Handbook of Statistics

Volume 23, 2003, Pages 123-141
The Missing Censoring-Indicator Model of Random Censorship

https://doi.org/10.1016/S0169-7161(03)23007-8

Publisher Summary

This chapter provides an overview of several methods of estimation of a survival function S(t) in the classical random censorship model when the censoring indicators are missing for a subset of the study subjects. The model is called the missing censoring-indicator (MCI) model of random censorship. Two well-known missingness mechanisms are presented. The existing asymptotically efficient estimators are the reduced data nonparametric maximum likelihood estimator (NPMLE) and an estimator obtained through standard estimating equation methodology. A semiparametric estimator is also presented and its asymptotic variance is compared with the information bound for estimating S(t). To facilitate comparison of the asymptotic variances of the semiparametric and (efficient) nonparametric estimators of H1(t) and S(t), the chapter calculates the information bound for estimating H1(t) and S(t) from the expressions for their efficient influence curves obtained by van der Laan and McKeague.

Introduction

In survival analysis, right-censored data are described by n independent and identically distributed (i.i.d.) copies of the observable pair (X,δ), where X is the minimum of a survival time T and a censoring time C that is independent of T, and δ is an indicator variable (called the censoring indicator henceforth) signifying whether the observed X equals T or not. When the censoring indicator is always observed, the well-known Kaplan–Meier estimator (Kaplan and Meier (1958)), KME henceforth, is the nonparametric maximum likelihood estimator (NPMLE) of the survival function S(t) of T. It has several appealing asymptotic properties, including asymptotic efficiency (cf. Wellner (1982)). For more on the KME, see Shorack and Wellner (1986), Fleming and Harrington (1991), and Andersen et al. (1993), among others.

Sometimes, however, the censoring indicator is not observed for a subset of the subjects investigated (e.g., in a bioassay experiment some subjects might not be autopsied to save expense, or the results of an autopsy may be inconclusive), leading to the missing censoring-indicator (MCI) model of random censorship. Specifically, let ξ be an indicator variable that may depend on X=min(T,C). The missingness indicator ξ assumes the value 1 when the censoring indicator δ is observed and takes the value 0 otherwise. The observed data in the MCI model of random censorship are n i.i.d. copies of Y=(X,ξ,σ) where σ=ξδ. The KME is inapplicable in the MCI model. In this article, we provide a general overview of several available estimators of S(t) in the MCI model under two well-known types of missing mechanisms. We also propose and analyze, under the less restrictive of the two missing mechanisms, a semiparametric estimator and show that it is more efficient than its nonparametric counterparts whenever the parametric component is correctly specified. The proposed semiparametric estimator is a direct extension of its classical random censorship model counterpart, originating in the work of Dikta (1998).
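To make the observed-data structure concrete, the following is a minimal simulation sketch of the MCI model. The exponential distributions for T and C, the logistic form of P(ξ=1∣X), and the function name simulate_mci are illustrative assumptions and do not come from the chapter.

```python
import math
import random


def simulate_mci(n, seed=0):
    """Draw n observations Y = (X, xi, sigma) from a toy MCI model."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        t = rng.expovariate(1.0)   # survival time T (assumed Exp(1))
        c = rng.expovariate(0.5)   # censoring time C (assumed Exp(0.5)), independent of T
        x = min(t, c)              # observed time X = min(T, C)
        delta = 1 if t <= c else 0  # censoring indicator, possibly unobserved
        # Assumed missingness mechanism: depends on X only, not on delta
        pi_x = 1.0 / (1.0 + math.exp(-(0.5 + x)))
        xi = 1 if rng.random() < pi_x else 0
        sigma = xi * delta         # equals delta when xi = 1, and 0 otherwise
        data.append((x, xi, sigma))  # delta itself is not recorded
    return data


sample = simulate_mci(1000)
```

Note that σ=0 is ambiguous without ξ: it means "censored" when ξ=1 but "unknown" when ξ=0, which is why the triple (X,ξ,σ) is recorded.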

The censoring indicators are said to be missing completely at random (MCAR) if the probability that ξ equals 1 does not depend on either X or δ, implying that the missing mechanism is independent of everything else observed in the MCI model. Assuming that the censoring indicators are MCAR allows consistent estimation of S(t) through the KME applied to the complete cases. Moreover, the maximum likelihood estimator (MLE) of P(ξ=1), which is readily computed under MCAR from the fully observed ξ's, can be employed to provide improved estimators of S(t); see Lo (1991) and Gijbels et al. (1993).
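The complete-case approach can be sketched as follows: discard every observation with ξ=0 and apply the usual product-limit formula to the remaining pairs, for which σ=δ. This is a minimal illustration in plain Python; the function names and the (x, xi, sigma) tuple layout are assumptions made for the example.

```python
def kaplan_meier(times, events):
    """Product-limit estimate, returned as [(t, S(t))] at distinct uncensored times."""
    pairs = sorted(zip(times, events))
    n = len(pairs)
    surv, out = 1.0, []
    i = 0
    while i < n:
        t = pairs[i][0]
        j, deaths = i, 0
        while j < n and pairs[j][0] == t:  # group tied observation times
            deaths += pairs[j][1]
            j += 1
        at_risk = n - i                    # subjects with X >= t
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            out.append((t, surv))
        i = j
    return out


def complete_case_km(data):
    """data: iterable of (x, xi, sigma). Keep rows with xi == 1, where sigma == delta."""
    cc = [(x, s) for x, xi, s in data if xi == 1]
    return kaplan_meier([x for x, _ in cc], [s for _, s in cc])
```

The sketch makes the source of inefficiency visible: the observed times X of subjects with ξ=0 are discarded even though they are fully observed.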

More generally, if the above probability depends on the fully observed X alone and not on the potentially unobservable δ, i.e.,

P(ξ=1∣X,δ)=P(ξ=1∣X), (1.1)

then the censoring indicators are said to be missing at random (MAR); van der Laan and McKeague (1998) note that (1.1) is also the “minimal coarsening at random (CAR) assumption” needed for asymptotic efficiency in the MCI model. In other words, under MAR, the observability of the censoring indicator δ depends only on X and not on the value of δ. The effect of the MAR assumption is that P(ξ=1∣X) is an infinite dimensional parameter, a function π(x), whose estimation is relatively more difficult. More importantly, however, MAR facilitates efficient estimation of S(t), unlike MCAR.

Denote the conditional probability of an uncensored observation given X=x by p(x) and the conditional probability of an observable uncensored observation given X=x by q(x). Interestingly, Eq. (1.1) is equivalent to the statement that, conditionally on X, the missingness and censoring indicators are independent: q(x)=P(σ=1∣X=x)=P(ξ=1∣X=x)P(δ=1∣X=x)=π(x)p(x). This makes it possible to estimate the parametric component of the semiparametric estimator that we analyze using only the “complete cases”; see Tsiatis et al. (2002) for a nice application involving MAR when covariates are present. See Little and Rubin (1987) for more details about the missing mechanisms MCAR and MAR.
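As a brief worked step, the factorization of q(x) can be derived from (1.1) as follows. This is a sketch in terms of π, p and q as defined in the text:

```latex
\begin{aligned}
q(x) &= P(\xi = 1,\ \delta = 1 \mid X = x) \\
     &= P(\xi = 1 \mid X = x,\ \delta = 1)\, P(\delta = 1 \mid X = x) \\
     &= \pi(x)\, p(x) \qquad \text{by (1.1)}.
\end{aligned}
```

Consequently, wherever π(x)>0, P(δ=1∣X=x, ξ=1)=q(x)/π(x)=p(x), so the complete cases carry the same conditional probability of an uncensored observation as the full sample.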

CAR is a generalization of MAR to the case of coarse data, which arise when a random quantity of interest is not observed directly, but instead a subset of values (of the sample space) in which the unobserved random quantity lies is observed. For example, in the MCI model, when the missingness indicator is 0, the censoring indicator's precise value is not observed, only the fact that it assumes a value in {0,1} is known, and this gives rise to coarse data. CAR allows the coarsening mechanism to be ignored while making inferences, see Heitjan and Rubin (1991); see also Jacobsen and Keiding (1995) and Gill et al. (1997).

An obvious drawback of the “complete-case estimator” is its high inefficiency when the degree of missingness is considerable. Several authors have proposed improvements over the complete-case estimator, assuming MCAR. Dinse (1982) used the EM algorithm and obtained an NPMLE. Lo (1991), however, showed that the NPMLE is non-unique and that some NPMLEs are inconsistent, and constructed two estimators, one of which is consistent and asymptotically normal. Gijbels et al. (1993) proposed a convex combination of two estimators, one of which was a modified Lo-estimator. McKeague and Subramanian (1998) proposed an estimator that employed Nelson–Aalen estimators of certain cumulative transition intensities, and showed how their approach can be used to obtain the estimators of Lo (1991) and Gijbels et al. (1993). None of these approaches, however, investigated asymptotic efficiency. For a somewhat different censorship model, Mukherjee and Wang (1992) derived the NPMLE of S(t) when the hazard rate is increasing, assuming that the censoring indicator is never observed but that the censoring distribution is always known instead.

The important issue of asymptotically efficient estimation in the MCI model was addressed by van der Laan and McKeague (1998), who introduced a sieved NPMLE, under a slightly stronger CAR assumption than (1.1), and obtained its influence curve. The influence curve of Ŝ(t), the estimator of S(t), is the random function denoted by IC(Y,t) satisfying the equation

$$n^{1/2}\{\hat{S}(t) - S(t)\} = n^{-1/2}\sum_{i=1}^{n} IC(Y_i,t) + o_p(1).$$

Indeed Ŝ(t)−S(t) is asymptotically an average of n i.i.d. random quantities where the ith quantity is the influence curve evaluated at the observed data Yi. They also obtained the efficient influence curve for estimating S(t). The variance of the efficient influence curve is the information bound for estimating S(t), and is the smallest asymptotic variance that an estimator can hope to achieve; an estimator that attains the information bound is deemed to be asymptotically efficient.

The survival function S(t) is a smooth (compactly differentiable) functional of the cumulative hazard function Λ(t), via the product integral mapping (cf. Gill and Johansen (1990)). The cumulative hazard function Λ(t) is in turn a compactly differentiable functional (cf. Gill (1989)) of the subdistribution function H1(t)=P(X≤t, δ=1) and the distribution function H(t)=P(X≤t) through the relation

$$\Lambda(t) = \int_0^t \frac{dH_1(s)}{1 - H(s)}.$$

Note that both H1(t) and H(t) are functionals of the bivariate distribution of (X,δ). Therefore, van der Laan and McKeague (1998) explained that an efficient estimator of the bivariate distribution of (X,δ) would readily yield an efficient estimator of S(t). Indeed Eq. (1.1) is essential for identifiability of the subdensity $f_{X,\delta}(x,d)$, where d=0 or 1. Van der Laan and McKeague noted that the subdensity $f_{X,\delta}(x,1)=P(X=x,\delta=1)$ can be written as

$$f_{X,\delta}(x,1) = \frac{f_{X,\sigma,\xi}(x,1,1)}{f_{\xi\mid X,\delta}(1\mid x,1)}, \qquad (1.2)$$

and pointed out that Eq. (1.1) is needed to make the denominator of (1.2) identifiable. This in turn implies that the subdensity $f_{X,\delta}(x,1)$ is identifiable provided that the denominator, which by (1.1) is π(x), is bounded away from zero.
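For orientation, the chain of mappings from (H1, H) to S described above can be written compactly as below. The product symbol denotes the product integral over s≤t. This is a standard representation, stated here as a sketch and not quoted from the chapter:

```latex
(H_1, H) \;\longmapsto\; \Lambda(t) = \int_0^t \frac{dH_1(s)}{1 - H(s)}
\;\longmapsto\; S(t) = \prod_{s \le t} \{\, 1 - d\Lambda(s) \,\}.
```

When Λ is continuous, the product integral reduces to S(t)=exp{−Λ(t)}; when Λ is a step function, it is an ordinary finite product over the jump points, which is the form taken by the KME.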

Van der Laan and McKeague's (1998) approach of estimating the joint distribution of (X,δ) was to present the problem in the framework of nonparametric estimation of a bivariate distribution from bivariate right-censored data in which the first component X is completely observed but the second component δ is right-censored by a discrete random variable. An important outcome of their work is the benchmark asymptotic variance (information bound) for comparing competing estimators of S(t) in the MCI model. Van der Laan and McKeague also employed the approach of Robins and Rotnitzky (1992) and proposed an explicit efficient estimator of H1(t) using the general theory of semi-parametric efficiency bounds; see Bickel et al. (1993) for more on information bound theory. Since the survival function is a compactly differentiable functional of H1(t) and H(t), replacing H1(t) and H(t) with their respective efficient estimators would immediately lead to an efficient estimator of S(t), see van der Vaart (1991).

All the aforementioned approaches to estimating S(t) in the MCI model are nonparametric, in that they do not suppose that the observed data are random samples from specific populations, letting the data speak for themselves instead. Sometimes scientific rationale may suggest that the parametric approach be employed for estimating some quantities in a proposed model, while the nonparametric approach be pursued for estimating other quantities. This is called a semiparametric approach. An example of a semiparametric model is the well-known and highly popular Cox proportional hazards model (Cox, 1972), in which the parametric component is specified by the regression parameter and the (infinite dimensional) nonparametric part is the baseline hazard function. It turns out that the semiparametric approach can be gainfully employed for estimating S(t) in the MCI model. The link is provided by a representation noted by Dikta (1998).

Denote the cumulative hazard function of X by ΛH(t). In the classical random censorship model, Dikta (1998) noted a representation in which Λ(t) is expressible in terms of p(t) and ΛH(t), and provided a semiparametric alternative to the KME, by assuming a parametric model p(t,θ) for the conditional probability of an uncensored observation and estimating the parametric component via maximum likelihood. He estimated ΛH(t) using a standard estimator that utilizes the empirical distribution of the fully observed X. Dikta further proved that his semiparametric estimator of S(t) is more efficient than the KME, in the sense of having smaller asymptotic variance, provided p(t) is not misspecified.
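The representation is not displayed in this excerpt. Under the definitions given earlier (p(x)=P(δ=1∣X=x), and H1, H as the subdistribution and distribution functions), it presumably takes the following form, shown here as a sketch:

```latex
dH_1(s) = p(s)\, dH(s)
\quad\Longrightarrow\quad
\Lambda(t) = \int_0^t \frac{p(s)\, dH(s)}{1 - H(s)}
           = \int_0^t p(s)\, d\Lambda_H(s).
```

Replacing p(s) by a fitted parametric model p(s,θ̂) and ΛH by its empirical estimator then yields a semiparametric estimator of Λ(t), and hence of S(t).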

Dikta's approach has been employed in a different setting as well. Sun and Zhu (2000) proposed a Dikta-type semiparametric estimator of a survival function in a left-truncated and right-censored model and derived its large-sample properties. Zhu et al. (2002) studied resampling methods for testing the goodness-of-fit of Dikta's semiparametric model.

Interestingly, Dikta's (1998) work finds yet another application, namely estimation in the MCI model, providing an attractive and very simple semiparametric alternative to the nonparametric approach pursued by researchers thus far in this model. The MAR assumption implies that the distribution of ξ is free of θ and this implies that the MLE of θ can be computed based on just the complete cases, see Section 3. Therefore, provided that the parametric model for p(t,θ) is specified correctly, the only additional computation relative to the KME is the calculation of a parameter estimate using maximum likelihood. The simplicity of the approach is perhaps a compelling rationale for choosing a semiparametric estimator for S(t) over a nonparametric estimator in the MCI model. Indeed, the semiparametric estimator would not require cumbersome bandwidth calculations based on estimates of densities and their derivatives that are so pervasive with the nonparametric estimators, see, for example, van der Laan and McKeague (1998). When the model for p(t,θ) is specified correctly, we show as in Dikta (1998) that the semiparametric estimator is more efficient (asymptotically) than the nonparametric estimators proposed by van der Laan and McKeague (1998).
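The computational simplicity claimed above can be illustrated with a short sketch. It assumes a two-parameter logistic model p(x,θ)=expit(a+bx) and a product-limit form in the spirit of Dikta (1998), in which the 0/1 censoring indicator of the KME is replaced by the fitted probability. The exact estimator analyzed in the chapter is not shown in this excerpt, so the function below is a hypothetical illustration, not the authors' definition.

```python
import math


def semiparametric_survival(data, theta, t):
    """Dikta-type product-limit estimate of S(t) in the MCI model (illustrative).

    data  : iterable of (x, xi, sigma); only the x values are used here,
            so observations with missing indicators still contribute.
    theta : (a, b) for the assumed model p(x, theta) = expit(a + b * x).
    """
    a, b = theta
    xs = sorted(x for x, _, _ in data)
    n = len(xs)
    surv = 1.0
    for i, x in enumerate(xs):
        if x > t:
            break
        p = 1.0 / (1.0 + math.exp(-(a + b * x)))  # fitted P(delta = 1 | X = x)
        surv *= 1.0 - p / (n - i)                  # n - i subjects still at risk
    return surv
```

No bandwidth or density estimate appears anywhere in the computation; the only fitted quantity is θ.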

Needless to say, correct specification of p(t,θ) is crucial for estimating S(t) well. Similar to the argument employed by Tsiatis et al. (2002), one may argue that logit(p(t))=log{p(t)/(1−p(t))} is the difference between the log survival and log censoring hazards, which is a smooth function of t. Hence a logistic model, after incorporation of additional polynomial terms, should in most cases provide an appropriate representation for p(t,θ). Cox and Snell (1989) provide methods that would be useful for analyzing binary data and for model-checking.
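The logit argument can be spelled out in one line. The hazard symbols λ_T and λ_C for the survival and censoring times, and the polynomial degree k, are notation introduced here for illustration; the identity assumes independent T and C with continuous distributions:

```latex
p(t) = \frac{\lambda_T(t)}{\lambda_T(t) + \lambda_C(t)}
\;\Longrightarrow\;
\operatorname{logit} p(t) = \log \lambda_T(t) - \log \lambda_C(t),
\qquad
\operatorname{logit} p(t,\theta) = \theta_0 + \theta_1 t + \cdots + \theta_k t^k .
```

For example, if both hazards are constant, then p(t) is constant and the intercept-only model θ_0 is already correctly specified.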

The article is organized as follows. Section 2 contains an overview of the existing estimators of S(t) in the classical random censorship model and the MCI model. In Section 3, the new semiparametric estimator of S(t) is introduced, analyzed, and its asymptotic variance compared with the information bound for nonparametric estimation of S(t). The article ends with a Conclusion section.

Overview of the estimators of a survival function

In this section, we first review the KME and Dikta's (1998) semiparametric estimator in the classical random censorship model. We then present an overview of the estimators proposed in the MCI model, first under MCAR and then under MAR (or more generally CAR). To facilitate comparison of the asymptotic variances of the semiparametric and (efficient) nonparametric estimators of H1(t) and S(t), we also calculate the information bound for estimating H1(t) and S(t) from the expressions for their

Semiparametric estimation in the MCI model

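The likelihood based on the observed data (X1,ξ1,σ1),…,(Xn,ξn,σn) is given by

$$\prod_{i=1}^{n} \{\pi(X_i)\}^{\xi_i}\,\{p(X_i,\theta)\}^{\sigma_i}\,\{1-p(X_i,\theta)\}^{\xi_i-\sigma_i}\,\{1-\pi(X_i)\}^{1-\xi_i}\,h(X_i).$$

Under MAR the distribution of ξ is free of θ, which implies that

$$L_n(\theta) = \prod_{i=1}^{n} \{p(X_i,\theta)\}^{\sigma_i}\,\{1-p(X_i,\theta)\}^{\xi_i-\sigma_i}$$

may be used as a likelihood for θ. Let θ̂ denote the MLE of θ0. Recall that $p_r(u,\theta)=\partial p(u,\theta)/\partial\theta_r$, r=1,…,k, and $D(u,\theta)=(p_1(u,\theta),\ldots,p_k(u,\theta))^T$. For each r,s=1,…,k, let $p_{r,s}(u,\theta)=\partial^2 p(u,\theta)/\partial\theta_r\,\partial\theta_s$. Under appropriate regularity conditions and using a standard Taylor-expansion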
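Because observations with ξ=0 contribute a factor of 1 to L_n(θ), maximizing it amounts to a binary-response fit on the complete cases. The sketch below does this by Newton–Raphson for an assumed two-parameter logistic model p(x,θ)=expit(a+bx). The model choice, function name and tuple layout are illustrative assumptions, and no safeguards for separated data or extreme linear predictors are included.

```python
import math


def fit_logistic_complete_cases(data, iters=50, tol=1e-10):
    """MLE of theta = (a, b) maximizing L_n(theta) for p(x, theta) = expit(a + b * x).

    data: iterable of (x, xi, sigma). Rows with xi == 0 are skipped because
    their likelihood contribution p^0 * (1 - p)^0 equals 1.
    """
    rows = [(x, s) for x, xi, s in data if xi == 1]
    a, b = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, s in rows:
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g0 += s - p           # score for the intercept
            g1 += (s - p) * x     # score for the slope
            h00 += w              # observed information entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:          # information matrix numerically singular
            break
        da = (h11 * g0 - h01 * g1) / det   # Newton step: inverse information times score
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b
```

The resulting (a, b) would then be plugged into whichever semiparametric estimator of S(t) is being used.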

Conclusion

This article provides an overview of several methods of estimation of a survival function S(t) in the classical random censorship model when the censoring indicators are missing for a subset of the study subjects. The model is called the missing censoring-indicator (MCI) model of random censorship. Two well-known missingness mechanisms were presented. The currently existing asymptotically efficient estimators are the reduced-data NPMLE and an estimator obtained through standard estimating

Acknowledgements

This research was supported by the University of Maine Summer Faculty Research Fund Award, 2002.

References (29)

  • Gijbels, I., Lin, D.Y., Ying, Z. (1993). Non- and semi-parametric analysis of failure time data with missing failure...
  • Gill, R.D. (1989). Non- and semi-parametric maximum likelihood estimators and the von Mises method (part I). Scand. J. Statist.
  • Gill, R.D., et al. (1990). A survey of product-integration with a view toward application in survival analysis. Ann. Statist.
  • Gill, R.D., et al. Coarsening at random.