MCMC-based estimation methods for continuous longitudinal data with non-random (non)-monotone missingness

https://doi.org/10.1016/j.csda.2010.04.026

Abstract

The analysis of incomplete longitudinal data requires joint modeling of the longitudinal outcomes (observed and unobserved) and the response indicators. When non-response does not depend on the unobserved outcomes, the missingness is said to be ignorable within a likelihood framework, obviating the need to formally model the process that drives it. For the non-ignorable or non-random case, estimation is less straightforward, because one must work with the observed data likelihood, which involves integration over the missing values; this gives rise to computational complexity, especially for high-dimensional missingness. The stochastic EM (SEM) algorithm is a variant of the expectation–maximization (EM) algorithm that is particularly useful when the E (expectation) step is intractable. Under the SEM algorithm, the E-step is replaced by an S-step, in which the missing data are simulated from an appropriate conditional distribution. The method is appealing for its computational simplicity. The SEM algorithm is used to fit non-random models for continuous longitudinal data with monotone or non-monotone missingness, using both simulated and case study data. The resulting SEM estimates are compared with their direct likelihood counterparts wherever possible.

Introduction

Clinical studies typically entail the evaluation of a particular response of interest over a period of time. Often, the longitudinal nature renders the data arising from such studies more susceptible to incompleteness, primarily in the response, and the extent and nature of that incompleteness can have serious implications for the resulting conclusions. Any meaningful analysis should therefore attempt to address the incompleteness in the data. With the recent developments in the area of missing data, it has been shown that ignoring the missing data (i.e., analyzing only the complete cases) would be valid only under the most restrictive, unrealistic assumptions (Rubin, 1976). A more principled approach should account for, rather than ignore, the mechanism driving the missingness, R, in addition to that driving the responses, Y, which can be partitioned into its observed and missing parts as Y = (Y^o, Y^m). Such an approach would thereby entail working with the joint distribution f(y, r | β, ψ) = f(y^o, y^m, r | β, ψ), where β and ψ are the parameter vectors characterizing the response and non-response processes, respectively.

The missingness or non-response process, R, can be broadly classified as being either monotone (or dropout), in which the unobserved measurements within a longitudinal series all occur after a particular measurement occasion, or non-monotone, for which missing values arise intermittently within the series. In addition, the missingness process, be it monotone or not, can be classified using the taxonomy introduced by Rubin (1976). A mechanism is said to be missing completely at random (MCAR) if the processes governing the missingness and the outcomes are independent, possibly conditionally on covariates. A less rigid assumption would be missing at random (MAR), for which the missingness may depend on the observed outcomes and on covariates but, given these, not further on the unobserved outcomes. When, in addition to such dependencies, the unobserved data provide further information about the missing data mechanism, then the mechanism is referred to as being missing not at random (MNAR).
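The practical consequence of this taxonomy can be illustrated with a small simulation; the sketch below (values and model are illustrative, not the paper's) generates a bivariate outcome and deletes the second measurement under each of the three mechanisms, then shows how the complete-case mean behaves:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Bivariate longitudinal outcome: Y1 always observed, Y2 possibly missing.
y1 = rng.normal(0.0, 1.0, n)
y2 = 0.5 * y1 + rng.normal(0.0, 1.0, n)

def missing_prob(logit):
    return 1.0 / (1.0 + np.exp(-logit))

# MCAR: missingness independent of both Y1 and Y2.
r_mcar = rng.random(n) < missing_prob(np.full(n, -1.0))
# MAR: missingness depends only on the observed Y1.
r_mar = rng.random(n) < missing_prob(-1.0 + 1.5 * y1)
# MNAR: missingness depends on the unobserved Y2 itself.
r_mnar = rng.random(n) < missing_prob(-1.0 + 1.5 * y2)

# Complete-case mean of Y2 (true mean is 0) under each mechanism:
for name, r in [("MCAR", r_mcar), ("MAR", r_mar), ("MNAR", r_mnar)]:
    print(name, round(y2[~r].mean(), 3))
```

Only under MCAR does the complete-case mean remain unbiased; under MAR and MNAR the selection on y1 (and hence, through the correlation, on y2) or on y2 directly pulls the observed mean away from zero, which is why a principled analysis must model the mechanism.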

When data are incomplete, likelihood-based approaches involve working with the observed data likelihood L*, which is obtained by integrating the full data likelihood L = f(y^o, y^m, r | β, ψ) over the missing components. As shown by Rubin (1976), under MCAR and MAR the resulting expression simplifies to L* = f(y^o | β) f(r | y^o, ψ), and inferences about β can be drawn independently of the missing data model. This property, termed ignorability, allows the use of standard software, provided it can handle unbalanced data. The MNAR case, however, admits no such simplification, and integration over the missing components is explicitly necessary. Depending on the complexity of the model used, not only is the approximation or evaluation of the integral itself involved, but its subsequent maximization can be computationally demanding.
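The factorization invoked from Rubin (1976) follows in two lines once the MAR assumption is written out; a brief derivation, with L* denoting the observed data likelihood:

```latex
% Observed data likelihood: integrate the full data likelihood over y^m.
L^{*} \;=\; \int f(y^{o}, y^{m}, r \mid \beta, \psi)\, dy^{m}
      \;=\; \int f(y^{o}, y^{m} \mid \beta)\, f(r \mid y^{o}, y^{m}, \psi)\, dy^{m}.

% Under MAR, f(r | y^o, y^m, \psi) = f(r | y^o, \psi) does not involve y^m,
% so the non-response factor moves outside the integral:
L^{*} \;=\; f(r \mid y^{o}, \psi) \int f(y^{o}, y^{m} \mid \beta)\, dy^{m}
      \;=\; f(y^{o} \mid \beta)\, f(r \mid y^{o}, \psi).
```

Since β and ψ appear in separate factors, maximization over β can ignore the second factor entirely; under MNAR the non-response density retains y^m and the integral must be evaluated as a whole.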

Broadly speaking, the evaluation and subsequent maximization of the observed data likelihood under non-ignorable missingness may be approached using either a stochastic or a non-stochastic solution. Non-stochastic solutions maximize either a direct evaluation or a numerical approximation of the integral linking the full data likelihood to the observed data likelihood. Diggle and Kenward (1994) provide such a solution for continuous Gaussian data with monotone missingness, while Troxel et al. (1998) propose a method for the non-monotone case. The latter authors combined a Gaussian model with a first-order ante-dependence structure for the outcomes with logistic regressions for the non-response indicators, further assuming independence among the latter. Despite these simplifying, but somewhat restrictive, assumptions, Troxel et al. (1998) reported that computations can become quite complicated and intractable for more than three or four repeated measurements. Stochastic methods, on the other hand, iteratively simulate random draws from the underlying non-standard distribution via Markov chains to fill in the missing data, and subsequently maximize the likelihood using the completed data. Because it avoids numerical integration altogether, the stochastic approach offers a computationally less demanding alternative.

This paper focuses on the use of the stochastic expectation–maximization (SEM) algorithm (Celeux and Diebolt, 1985) for fitting selection models for continuous longitudinal data with MNAR missingness, be it monotone or non-monotone, with subsequent comparison of the resulting stochastic solutions with their non-stochastic counterparts. Through a simulation study (Section 5), the merits of the stochastic approach over non-stochastic methods are highlighted, emphasizing the value of the former as a practical alternative to the latter. The stochastic EM algorithm is further applied to fit models for two case studies (described in Section 2, with results presented in Section 6). The various modeling frameworks considered are described in Section 3, while estimation procedures are presented in Section 4. Some concluding remarks are given in Section 7.

Section snippets

Age-related macular degeneration trial

The first case study data were obtained from a randomized multicenter clinical trial comparing an experimental treatment (interferon-α) with a corresponding placebo in the treatment of patients with age-related macular degeneration (ARMD) (Pharmacological Therapy for Macular Degeneration Study Group, 1997), a condition under which patients progressively lose vision. In the trial, a patient’s visual acuity was assessed at 4, 12, 24, and 52 weeks by his/her ability to read lines of letters on

Model frameworks

Suppose that for subject i, i = 1, 2, …, N, a sequence of measurements Yij is designed to be taken at time points tij, j = 1, 2, …, ni, resulting in a vector Yi = (Yi1, Yi2, …, Yini) of measurements for each participant. Further, for each measurement in the series, define the response indicators Rij = 1 if Yij is observed and Rij = 0 otherwise, organized into a vector Ri = (Ri1, Ri2, …, Rini) of parallel structure to Yi. Typically, Yi can be partitioned into two sub-vectors: Yio consisting of those Yij
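The bookkeeping just described — indicators Rij derived from the observation pattern, and the partition of Yi into observed and missing parts — can be sketched for a single subject (the values are illustrative):

```python
import numpy as np

# One subject's designed measurement series; NaN marks an unobserved occasion.
# The gap mid-series makes the pattern non-monotone.
y_i = np.array([24.3, np.nan, 19.8, np.nan])

# Response indicators R_ij = 1 if Y_ij is observed, 0 otherwise,
# a vector of parallel structure to Y_i.
r_i = (~np.isnan(y_i)).astype(int)

# Partition Y_i into its observed sub-vector and the missing positions.
y_obs = y_i[r_i == 1]
missing_idx = np.where(r_i == 0)[0]

print(r_i)        # indicator vector
print(y_obs)      # observed sub-vector Y_i^o
```

Had the example instead been (24.3, 19.8, NaN, NaN), the pattern would be monotone (dropout): all indicators equal to 0 follow the last 1.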

Estimation methodology: The stochastic EM algorithm

The expectation–maximization (EM) algorithm (Dempster et al., 1977), a general-purpose iterative algorithm for calculating maximum likelihood estimates, has become quite popular in handling incomplete data problems. The fundamental idea behind the EM algorithm is to associate with the given incomplete data problem a complete data problem for which maximum likelihood estimation is computationally more tractable. The algorithm alternates the following two steps. The expectation or E-step
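The mechanics of replacing the E-step by a simulation step can be seen in a deliberately stripped-down setting. The toy sketch below (not the paper's selection model: the mechanism here is MCAR, purely so that the S-step is a standard normal draw) runs a stochastic EM chain for the mean and variance of a normal sample with missing values:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: univariate normal sample with roughly 30% of values missing.
y = rng.normal(5.0, 2.0, 500)
obs = rng.random(500) > 0.3
y_obs = y[obs]
n_mis = (~obs).sum()

mu, sigma2 = y_obs.mean(), y_obs.var()   # starting values
draws = []
for it in range(2000):
    # S-step: simulate the missing data from their conditional distribution
    # given the current parameter estimates (here simply N(mu, sigma2)).
    y_mis = rng.normal(mu, np.sqrt(sigma2), n_mis)
    y_full = np.concatenate([y_obs, y_mis])
    # M-step: complete-data maximum likelihood estimates.
    mu, sigma2 = y_full.mean(), y_full.var()
    if it >= 1000:                        # discard burn-in iterations
        draws.append(mu)

# The chain does not converge pointwise; an SEM estimate is taken as the
# average over the stationary phase of the chain.
mu_hat = np.mean(draws)
print(round(mu_hat, 2))
```

Unlike deterministic EM, the parameter sequence is a Markov chain rather than a convergent sequence, so estimates are read off as averages after burn-in; no integral over the missing data is ever evaluated.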

Simulation study

To evaluate the performance of the SEM algorithm in fitting non-ignorable models for incomplete Gaussian longitudinal data, simulations were conducted for both the monotone and non-monotone situations. Data generating models, simulation settings and subsequent results under each case are described in the following subsections.
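For the monotone case, data of the kind described can be generated by pairing a Gaussian longitudinal model with a Diggle–Kenward-type dropout model; the sketch below uses illustrative parameter values (the paper's exact simulation settings may differ):

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_occ = 500, 4

# Gaussian longitudinal outcomes with a compound-symmetry covariance
# (variance 4, correlation 0.6) -- illustrative values.
mean = np.full(n_occ, 10.0)
cov = 4.0 * (0.6 * np.ones((n_occ, n_occ)) + 0.4 * np.eye(n_occ))
y = rng.multivariate_normal(mean, cov, size=n)

# Dropout model: the probability of dropping out at occasion j depends on
# the previous outcome AND the current, possibly unobserved one;
# psi2 != 0 makes the mechanism non-random (MNAR).
psi0, psi1, psi2 = -6.0, 0.2, 0.3
r = np.ones((n, n_occ), dtype=int)       # first occasion always observed
for i in range(n):
    for j in range(1, n_occ):
        logit = psi0 + psi1 * y[i, j - 1] + psi2 * y[i, j]
        if rng.random() < 1.0 / (1.0 + np.exp(-logit)):
            r[i, j:] = 0                 # monotone: all later occasions missing
            break

print("dropout proportion:", round(1 - r[:, -1].mean(), 2))
```

Setting psi2 = 0 reduces the mechanism to MAR, which makes such a generator convenient for comparing ignorable and non-ignorable fits on the same data; for the non-monotone case the occasion loop would instead draw each indicator without forcing later occasions to be missing.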

SEM results were compared with those obtained using Newton–Raphson with numerical integration, which will hereinafter be referred to as the direct likelihood approach. The

ARMD: monotone case

A subset of the ARMD data consisting of 226 subjects having either a complete set of responses or a dropout type of missingness was analyzed. For the 4 repeated measurements, a saturated means model was formulated, with the treatment (interferon-α vs. placebo) received by the patient, denoted xi, as covariate, and with an unstructured covariance, i.e., E(Yij | xi, β) = β0j + β1j xi and Σi = [σrc], for i = 1, 2, …, 226 and j, r, c = 1, 2, 3, 4. For dropout, a model of the form (1) was considered. The

Discussion

In this paper, the performance of the stochastic EM algorithm in fitting non-ignorable selection models for continuous longitudinal data with monotone or non-monotone missingness was assessed. Stochastic EM is a convenient alternative to the deterministic EM algorithm because, by replacing the E-step with a simulation step, integration of the complete data likelihood over the missing values is avoided altogether, since the S (simulation) step renders the data

Acknowledgement

The authors gratefully acknowledge financial support from the Interuniversity Attraction Pole Research Network P6/03 of the Belgian Government (Belgian Science Policy).

References (26)

  • A.M. Gad et al., Analysis of longitudinal data with intermittent missing values using the stochastic EM algorithm, Computational Statistics and Data Analysis (2006)
  • I. Jansen et al., The nature of sensitivity in missing not at random models, Computational Statistics and Data Analysis (2006)
  • G. Celeux, D. Chauveau, J. Diebolt, On stochastic versions of the EM algorithm, Technical Report 2514, INRIA... (1995)
  • G. Celeux et al., The SEM algorithm: a probabilistic teacher algorithm derived from the EM algorithm for the mixture problem, Computational Statistics Quarterly (1985)
  • G. Celeux et al., A stochastic approximation type EM algorithm for the mixture problem, Stochastics and Stochastics Reports (1992)
  • R. Crouchley et al., The common structure of several recent statistical models for dropout in repeated continuous responses, Statistical Modelling (2002)
  • J.R. Dale, Global cross-ratio models for bivariate, discrete, ordered responses, Biometrics (1986)
  • A.P. Dempster et al., Maximum likelihood from incomplete data via the EM algorithm (with discussion), Journal of the Royal Statistical Society, Series B (1977)
  • J. Diebolt et al., Stochastic EM: method and application
  • P.J. Diggle et al., Informative dropout in longitudinal data analysis (with discussion), Applied Statistics (1994)
  • B. Efron, Missing data, imputation, and the bootstrap, Journal of the American Statistical Association (1994)
  • A. Gelman et al., Inference from iterative simulation using multiple sequences, Statistical Science (1992)
  • G.F.V. Glonek et al., Multivariate logistic models, Journal of the Royal Statistical Society, Series B (1995)