Influence analyses of nonlinear mixed-effects models
Introduction
There has been a great deal of recent interest in mixed-effects models for repeated-measures data that arise in different areas of investigation, such as economics and pharmacokinetics. Repeated-measures data are generated by observing a number of subjects (individuals) repeatedly under differing experimental conditions. Observations on the same subject are usually made at different times, as in longitudinal studies. Mixed-effects models assume that the intrasubject model that relates the response variable to time is the same for all subjects, but that the model parameters may vary from subject to subject. The linear mixed-effects model is an important statistical tool that is frequently used for evaluating the performance of products, for determining sampling designs and quality-control procedures, and for statistical genetics, particularly in longitudinal studies. However, many repeated-measures data, such as growth data, dose-response data and pharmacokinetic data, are inherently nonlinear with respect to a given response regression function. Several different nonlinear mixed-effects models have been proposed in recent years (see Sheiner and Beal, 1980; Mallet et al., 1988; Lindstrom and Bates, 1990; Vonesh and Carter, 1992; Davidian and Gallant, 1993; Pinheiro and Bates, 1995; Walker, 1996; Vonesh et al., 2002, among others). Due to the complexity of the models, obtaining the maximum likelihood (ML) estimates is a nontrivial problem. ML estimation was pioneered by Beal and Sheiner (1979), and since then a number of algorithms have been proposed for obtaining approximate ML solutions, including those of Lindstrom and Bates (1990), Beal and Sheiner (1992), Davidian and Gallant (1993), and Pinheiro and Bates (1995). Recently, Walker (1996) introduced an EM algorithm (Dempster et al., 1977) for exact ML estimation.
Detecting outliers and influential observations and studying the sensitivity of results to departures from basic assumptions are important issues in statistical analysis.
Following the pioneering work of Cook (1977, 1986), this area of research has received much attention; see Belsley et al. (1980), Chatterjee and Hadi (1988), Christensen et al. (1992), Banerjee and Frees (1997), Lesaffre and Verbeke (1998), Critchley et al. (2001), Zhu and Lee (2001), Zhu et al. (2001), and Lee and Xu (2002), among others. For nonlinear mixed-effects models, very little has been done on obtaining local influence measures and case-deletion measures. The main objective of this paper is to develop methods for obtaining these measures. Given the ML estimates, we obtain the diagnostic measures via the methods proposed by Zhu and Lee (2001) and Zhu et al. (2001). The key idea of the development is to treat the random effects as hypothetical missing data and to work with the conditional expectation of the complete-data log-likelihood function in the EM algorithm (Dempster et al., 1977). Diagnostic measures for local influence are based on the conformal normal curvature (Poon and Poon, 1999), whilst the case-deletion measures are based on the one-step approximation of Cook and Weisberg (1982). These diagnostic measures cannot be obtained in closed form because they involve intractable integrals. A Metropolis–Hastings (MH) algorithm (Metropolis et al., 1953; Hastings, 1970) is implemented to simulate a sufficiently large sample of random effects from the appropriate conditional distribution for approximating these integrals. As this sample can be obtained as a by-product of the estimation, the additional computational burden is light.
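The idea of replacing an intractable conditional expectation by an average over MH draws of the random effects can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: a single subject with a scalar random effect b, the toy mean function exp(beta + b) * x, the data values, the random-walk proposal, and the tuning constants. The paper's own programs are written in C; Python is used here only for brevity. Given draws b(1), ..., b(T) from the conditional distribution of b given the observed data, a conditional expectation E[h(b) | y] is approximated by the sample average of h(b(t)).

```python
import math
import random


def log_target(b, y, x, beta, sigma, sigma_b):
    """Unnormalised log density of the random effect b given the data y.

    Hypothetical toy model: y_j = exp(beta + b) * x_j + noise, with
    noise ~ N(0, sigma^2) and b ~ N(0, sigma_b^2).
    """
    mean_scale = math.exp(beta + b)
    resid_ss = sum((yj - mean_scale * xj) ** 2 for yj, xj in zip(y, x))
    return -0.5 * resid_ss / sigma ** 2 - 0.5 * b ** 2 / sigma_b ** 2


def mh_sample(y, x, beta, sigma, sigma_b, T=2000, burn_in=500, step=0.05, seed=0):
    """Random-walk Metropolis-Hastings draws of b from p(b | y)."""
    rng = random.Random(seed)
    b = 0.0
    logp = log_target(b, y, x, beta, sigma, sigma_b)
    draws, accepted = [], 0
    for it in range(burn_in + T):
        proposal = b + rng.gauss(0.0, step)
        logp_prop = log_target(proposal, y, x, beta, sigma, sigma_b)
        # Symmetric proposal, so the acceptance probability is min(1, ratio).
        accept_prob = math.exp(min(0.0, logp_prop - logp))
        if rng.random() < accept_prob:
            b, logp = proposal, logp_prop
            accepted += 1
        if it >= burn_in:
            draws.append(b)  # keep only post-burn-in states
    return draws, accepted / (burn_in + T)


# Illustrative data (made up for this sketch, not from the paper).
x = [1.0, 2.0, 3.0, 4.0]
y = [1.25, 2.40, 3.70, 4.85]

draws, acc_rate = mh_sample(y, x, beta=0.0, sigma=0.2, sigma_b=0.5)

# Monte Carlo approximations of E[b | y] and E[b^2 | y].
post_mean = sum(draws) / len(draws)
post_second_moment = sum(b * b for b in draws) / len(draws)
print(post_mean, post_second_moment, acc_rate)
```

In this sketch the same set of draws is reused for every expectation of interest, which mirrors the remark that a sample generated during estimation can serve the diagnostics at little extra cost.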
The paper is organized as follows. Section 2 introduces the nonlinear mixed-effects models and the ML estimation. The diagnostic measures are derived in Section 3. Two real examples are given in Section 4. Some technical details are given in the appendices.
Nonlinear mixed-effects model and its ML estimation
Consider the following nonlinear mixed-effects model as proposed by Pinheiro and Bates (1995). In the first stage the $j$th observation on the $i$th subject is modeled as $y_{ij} = f(\phi_{ij}, x_{ij}) + \varepsilon_{ij}$, $i = 1, \dots, I$, $j = 1, \dots, n_i$, where $f$ is a nonlinear function of a subject-specific parameter vector $\phi_{ij}$ and the predictor $x_{ij}$, $\varepsilon_{ij}$ is a normally distributed noise term, $I$ is the total number of subjects, and $n_i$ is the number of observations on the $i$th subject. In the second stage the subject-specific parameter vector is
Diagnostic analysis
There are basically two approaches for detecting influential observations that seriously influence the results of a statistical analysis. The first is the case-deletion approach, in which the impact of deleting an observation on the estimation is directly assessed by metrics such as the likelihood distance and Cook's distance (see Cook, 1977). The second is the local influence approach (Cook, 1986), in which the stability of the estimation outputs with respect to the model
Illustrative examples
In the following examples, all quantities needed for the diagnostic measures are computed from the formulas given in the preceding sections, with T=2000 observations generated by the MH algorithm from the appropriate conditional distributions. The benchmark is taken to be 1/m+2SM(0). Results were obtained by computer programs written in C; listings of these programs can be obtained from the authors upon request.
Example 1 (Orange trees data). The data, which consist of seven measurements of the trunk circumference (in millimeters) on each of five
Discussion
As the observed-data log-likelihood function of the nonlinear mixed-effects models is rather complicated, it is difficult to detect influential observations by direct application of the traditional approaches given in Cook (1977, 1986). In this paper, we propose a procedure for computing case-deletion measures and local influence diagnostics on the basis of the conditional expectation of the complete-data log-likelihood function in relation to the EM algorithm. As observations simulated at
Acknowledgements
The research is fully supported by a grant (CUHK 4356/00H) from the Research Grants Council of the Hong Kong Special Administrative Region.
References (35)
- Banerjee and Frees, 1997. Influence diagnostics for linear longitudinal models. J. Amer. Statist. Assoc.
- et al., 1988. Nonlinear Regression Analysis and Its Applications.
- Beal, S.L., Sheiner, L.B., 1979. NONMEM Users’ Guide, Part I. Division of Clinical Pharmacology, University of...
- Beal, S.L., Sheiner, L.B., 1992. NONMEM Users’ Guide, Part VII, Conditional Estimation Methods, NONMEM Project Group....
- et al., 1983. Outliers. Technometrics.
- Belsley et al., 1980. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity.
- Chatterjee and Hadi, 1988. Sensitivity Analysis of Linear Regression.
- Christensen et al., 1992. Case-deletion diagnostics for mixed models. Technometrics.
- Cook, 1977. Detection of influential observations in linear regression. Technometrics.
- Cook, 1986. Assessment of local influence. J. Roy. Statist. Soc. Ser. B.