Abstract
In medicine and the health sciences, mixed effects models are often used to study time-structured data. Optimal designs for such studies have been shown to be useful for improving the precision of the parameter estimators. However, these designs are often derived under the assumption of zero autocorrelation between the errors, especially for binary data. Ignoring or misspecifying the autocorrelation at the design stage can result in a loss of efficiency. This paper addresses the robustness of Bayesian D-optimal designs for the logistic mixed effects model for longitudinal data with a linear or quadratic time effect against incorrect specification of the autocorrelation. To find the Bayesian D-optimal allocations of time points for different values of the autocorrelation, under different priors for the fixed effects and different covariance structures of the random effects, a scalar function of the approximate variance–covariance matrix of the fixed effects is optimized. Two approximations are compared: one based on first-order penalized quasi-likelihood (PQL1) and one based on an extended version of the generalized estimating equations (GEE). The results show that Bayesian D-optimal allocations of time points are robust against misspecification of the autocorrelation and are approximately equally spaced. Moreover, PQL1 and extended GEE give essentially the same Bayesian D-optimal allocation of time points for a given subject-to-measurement cost ratio. Furthermore, Bayesian optimal designs are hardly affected by the choice of either the covariance structure or the prior distribution.






References
Abebe HT, Tan FES, van Breukelen GJP, Berger MPF (2013) Bayesian design for dichotomous repeated measurements with autocorrelation. Stat Methods Med Res. doi:10.1177/0962280213508850
Abebe HT, Tan FES, van Breukelen GJP, Berger MPF (2014a) Bayesian D-optimal designs for the two parameter logistic mixed effects model. Comput Stat Data Anal 71:1166–1176
Abebe HT, Tan FES, van Breukelen GJP, Serroyen J, Berger MPF (2014b) On the choice of a prior for Bayesian D-optimal designs for the logistic regression model with a single predictor. Commun Stat Simul Comput 43(7):1811–1824
Atkinson AC, Haines LM (1996) Designs for nonlinear and generalized linear models. Handb Stat 13:437–475
Atkinson AC, Donev AN, Tobias RD (2007) Optimum Experimental Designs, with SAS. Clarendon, Oxford
Ausems M, Mesters I, Van Breukelen G, De Vries H (2002) Short-term effects of a randomized computer-based out-of-school smoking prevention trial aimed at Dutch elementary schoolchildren. Prev Med 34:581–589
Berger MPF, Tan FES (2004) Robust designs for linear mixed effects models. Appl Stat 53:569–581
Berger MPF, Wong WK (2009) An introduction to optimal designs for social and biomedical research. Wiley, Chichester
Breslow NE, Clayton DG (1993) Approximate inference in generalized linear mixed models. J Am Stat Assoc 88:9–25
Carrière I, Bouyer J (2002) Choosing marginal or random-effects models for longitudinal binary studies: application to self-reported disability among older persons. BMC Med Res Methodol 2:15
Chaloner K, Larntz K (1989) Optimal Bayesian designs applied to logistic regression experiments. J Stat Plan Inference 18:191–208
Chaloner K, Verdinelli I (1995) Bayesian experimental design. A review. Stat Sci 10:273–304
Chernoff H (1953) Locally optimal designs for estimating parameters. Ann Math Stat 24:586–602
Chipman H, Welch W (1996) D-optimal design for generalized linear models, unpublished manuscript downloaded from http://math.acadiau.ca/chipmanh/publications.html
Dargent-Molina P, Favier F, Grandjean H, Baudoin C, Schott AM, Hausherr E, Meunier PJ, Bréart G (1996) Fall-related factors and risk of hip fracture: the EPIDOS prospective study. Lancet 348:145–149
Diggle PJ, Liang K-Y, Zeger S (1994) Analysis of longitudinal data. Clarendon Press, Oxford
Gilks WR, Wild P (1992) Adaptive rejection sampling for Gibbs sampling. Appl Stat 41:337–348
Gilks WR, Best NG, Tan KKC (1995) Adaptive rejection Metropolis sampling within Gibbs sampling. Appl Stat 44:455–472
Han C, Chaloner K (2004) Bayesian experimental designs for nonlinear mixed models with application to HIV dynamics. Biometrics 60:25–33
Jang W, Lim J (2009) A numerical study of PQL estimation biases in generalized linear mixed models under heterogeneity of random effects. Commun Stat Simul Comput 38:692–702
Khuri AI, Mukherjee B, Sinha BK, Ghosh M (2006) Design issues for generalized linear models: a review. Stat Sci 21:376–399
Matlab (2010) Handbook Matlab, version 7.10.0.499(R2010a). MathWorks Inc., MA, USA
Moerbeek M, Van Breukelen GJP, Berger MPF (2001) Optimal experimental designs for multilevel logistic models. Statistician 50(1):17–30
Moerbeek M, Van Breukelen GJP, Berger MPF (2003) A comparison of estimation methods for multilevel logistic models. Comput Stat 18(1):19–37
Moerbeek M, Maas CJM (2005) Optimal experimental designs for multilevel logistic models with two binary predictors. Commun Stat Theory Methods 34:1151–1167
Molenberghs G, Verbeke G (2005) Models for discrete longitudinal data. Springer, New York
Niaparast M (2009) On optimal design for a Poisson regression model with random intercept. Stat Probab Lett 79:741–747
Niaparast M, Schwabe R (2013) Optimal design for quasi-likelihood estimation in Poisson regression with random coefficients. J Stat Plan Inference 143:296–306
Ouwens MJNM, Tan FES, Berger MPF (2006) A maximin criterion for the logistic random intercept model with covariates. J Stat Plan Inference 136:962–981
Silvey SD (1980) Optimal design: an introduction to the theory for parameter estimation. Chapman and Hall, New York
Sinha SK, Xu X (2011) Sequential D-optimal designs for generalized linear mixed models. J Stat Plan Inference 141:1394–1402
Sommer A, Katz J, Tarwotjo I (1984) Increased risk of respiratory infection and diarrhea in children with pre-existing mild vitamin A deficiency. Am J Clin Nutr 40:1090–1095
Tan FES, Berger MPF (1999) Optimal allocation of time points for random effects models. Commun Stat Simul Comput 28:517–540
Tekle FB, Tan FES, Berger MPF (2008) Maximin D-optimal designs for binary longitudinal responses. Comput Stat Data Anal 52:5253–5262
Woods DC, van de Ven P (2006) Blocked designs for experiments with correlated non-normal response. Technometrics 53:173–182
Zeger SL, Liang K-Y, Albert PS (1988) Models for longitudinal data: a generalized estimating equation approach. Biometrics 44:1049–1060
Appendix: Derivation for the relative efficiency Eq. (14)
To compare designs, we compute their efficiencies using the concept of equivalent sample size (see Atkinson et al. 2007, page 152 and Berger and Wong 2009, page 37). Let \({\text { Var}}\left( {\hat{\varvec{\beta }}_{\xi _s}} \right) \) and \( {\text {Var}}\left( {\hat{\varvec{\beta }}_{\xi _q } } \right) \) be the variance–covariance matrices of \( \hat{\varvec{\beta }} \) for the design \( \xi _s \) with \( s \) time points and the design \( \xi _q \) with \( q \) time points, respectively, and let \( N_{s} \) and \( N_{q} \) be the numbers of subjects for the designs \( \xi _s \) and \( \xi _q\), respectively. For the D-criterion and a given model with \( p \) parameters, the relative efficiency (RE) of design \( \xi _s \) compared to design \( \xi _q \) is given by:
where the two determinants in (15) are both based on one subject only, and the factor \( N_{s}/N_{q} \) takes into account the sample size per design.
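The display numbered (15) does not appear above. A form consistent with its description can be sketched as follows, assuming each \( {\text {Var}}\left( \cdot \right) \) is the per-subject variance–covariance matrix, so that the matrix for \( N \) subjects is \( {\text {Var}}\left( \cdot \right) /N \) and \( \det \left( A/N \right) =\det \left( A \right) /N^{p} \) for a \( p\times p \) matrix \( A \). The notation \( {\text {RE}}\left( \xi _s ,\xi _q \right) \) is introduced here for illustration only:

\[
{\text {RE}}\left( \xi _s ,\xi _q \right)
=\left\{ \frac{\det \left[ {\text {Var}}\left( \hat{\varvec{\beta }}_{\xi _q} \right) /N_{q} \right] }{\det \left[ {\text {Var}}\left( \hat{\varvec{\beta }}_{\xi _s} \right) /N_{s} \right] } \right\} ^{1/p}
=\frac{N_{s}}{N_{q}}\left\{ \frac{\det {\text {Var}}\left( \hat{\varvec{\beta }}_{\xi _q} \right) }{\det {\text {Var}}\left( \hat{\varvec{\beta }}_{\xi _s} \right) } \right\} ^{1/p}
\]

Under this reading, an RE above one means that design \( \xi _s \) estimates the parameters more precisely than design \( \xi _q \) for the given numbers of subjects.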
This RE (15) can be rewritten as follows:
Rewriting \( N_{s} \) and \( N_{q} \) in terms of the cost ratio \( k \) and the number of time points, for the same total cost \( C \) under the cost function Eq. (13), i.e., \( N_s =\frac{C}{C_2 \left( {k+s} \right) } \) and \( N_q =\frac{C}{C_2 \left( {k+q} \right) }\), we obtain
The RE in (17) applies to locally optimal designs, i.e., to given parameter values. Generalizing this to Bayesian designs, the RE of design \( \xi _s \) compared to design \( \xi _q \) with prior distribution \( \pi \) for \( \varvec{\beta } \) becomes:
Thus, using the Bayesian D-optimality criterion (12), the RE will be:
When the ratio \( \frac{k+q}{k+s} \) equals one (\( q=s \)) or approaches one (the cost ratio \( k \) is very large), the RE in (19) reduces to the RE given by Chaloner and Larntz (1989).
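The cost-adjusted comparison described in this appendix can be illustrated numerically. The sketch below is not the authors' implementation (the reference list suggests they worked in Matlab). It assumes that the Bayesian criterion (12) is the prior expectation of \( \log \det {\text {Var}}( \hat{\varvec{\beta }}_{\xi }) \), that this expectation is approximated by averaging over draws from the prior \( \pi \), and that the RE therefore takes the form \( \frac{k+q}{k+s}\exp \{ \frac{1}{p}( E_{\pi }[ \log \det {\text {Var}}( \hat{\varvec{\beta }}_{\xi _q}) ] -E_{\pi }[ \log \det {\text {Var}}( \hat{\varvec{\beta }}_{\xi _s}) ] ) \} \). The function name and its arguments are hypothetical.

```python
import numpy as np


def bayesian_relative_efficiency(vars_s, vars_q, s, q, k):
    """Cost-adjusted RE of design xi_s (s time points) versus xi_q (q time points).

    vars_s, vars_q: arrays of shape (draws, p, p) holding the per-subject
    variance-covariance matrix of the fixed-effects estimator, evaluated at
    each draw from the prior (ASSUMPTION: Monte Carlo stand-in for the
    prior expectation in the Bayesian criterion).
    k: subject-to-measurement cost ratio.
    """
    vars_s = np.asarray(vars_s, dtype=float)
    vars_q = np.asarray(vars_q, dtype=float)
    p = vars_s.shape[-1]

    # slogdet works on stacked matrices and avoids overflow in det().
    _, logdet_s = np.linalg.slogdet(vars_s)
    _, logdet_q = np.linalg.slogdet(vars_q)

    # N_s / N_q at equal total cost: [C / (C2 (k+s))] / [C / (C2 (k+q))].
    cost_factor = (k + q) / (k + s)

    # Average the log-determinants over the prior draws, then compare.
    return cost_factor * np.exp((logdet_q.mean() - logdet_s.mean()) / p)


# Toy input (made-up numbers): three prior draws, p = 2 parameters.
toy_vars_s = np.stack([np.eye(2)] * 3)        # design with s = 4 time points
toy_vars_q = np.stack([4.0 * np.eye(2)] * 3)  # design with q = 2 time points
toy_re = bayesian_relative_efficiency(toy_vars_s, toy_vars_q, s=4, q=2, k=2.0)
print(round(toy_re, 4))
```

With a single prior draw the function reduces to the locally optimal case, and with \( s=q \) the cost factor drops out, which matches the remark above about the ratio \( \frac{k+q}{k+s} \).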
Cite this article
Abebe, H.T., Tan, F.E.S., Van Breukelen, G.J.P. et al. Robustness of Bayesian D-optimal design for the logistic mixed model against misspecification of autocorrelation. Comput Stat 29, 1667–1690 (2014). https://doi.org/10.1007/s00180-014-0512-3