Computation of c-optimal designs for models with correlated observations

https://doi.org/10.1016/j.csda.2016.10.019

Abstract

In the optimal design of experiments setup, different optimality criteria can be considered depending on the objectives of the practitioner. One of the most widely used is c-optimality, which for a given model looks for the design that minimizes the variance of the linear combination of the parameters’ estimators given by the vector c. c-optimal designs are needed when dealing with standardized criteria, and are especially useful when c is taken to be each one of the Euclidean vectors, since in that case they provide the best designs for estimating the individual parameters. The well-known procedure proposed by Elfving for independent observations is the origin of the procedure that can be used in the correlation framework. Some analytical results are shown for the model with constant covariance, but even in this case the computational task can become quite hard. For this reason, an algorithmic procedure is proposed; it can be used when dealing with a general model and some covariance structures.

Introduction

Let us assume the linear model

$$Y = X\beta + \epsilon, \tag{1}$$

where $Y=(y_1,\dots,y_n)^t$ denotes the vector of observations, $X=(f(x_1),\dots,f(x_n))^t$ is the design matrix, with $f(x)=(f_1(x),\dots,f_m(x))^t$ and the $f_i(x)$ linearly independent on the experimental domain $\mathcal{X}$, $\beta$ is the $m$-vector of parameters and $\epsilon$ is the vector of error terms, which will be assumed normally distributed with covariance matrix $\Sigma$.

For non-independent observations, the information matrix for a design $\xi$ is given by

$$M(\xi) = X^t \Sigma^{-1} X. \tag{2}$$

The inverse of the information matrix $M(\xi)$ is proportional to the covariance matrix of the estimators of the parameters of the model, thus an optimality criterion typically minimizes a function of $M^{-1}(\xi)$.
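As a small worked illustration (not taken from the paper), consider a two-point design at $x_1 \neq x_2$ for the straight-line model $f(x) = (1, x)^t$, with errors of common variance $\sigma^2$ and correlation $r$, $|r| < 1$. The symbols $\sigma^2$ and $r$ and the choice $c = (0,1)^t$ are assumptions introduced only for this example. Because $X$ is square and invertible here, $M^{-1}(\xi) = X^{-1} \Sigma X^{-t}$, and the variance attached to the slope can be written out directly:

$$X = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \end{pmatrix}, \qquad \Sigma = \sigma^2 \begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix}, \qquad X^{-t} c = \frac{1}{x_2 - x_1} \begin{pmatrix} -1 \\ 1 \end{pmatrix},$$

$$c^t M^{-1}(\xi)\, c = (X^{-t} c)^t\, \Sigma\, (X^{-t} c) = \frac{2\sigma^2 (1 - r)}{(x_2 - x_1)^2}.$$

This agrees with the variance of the difference quotient $(y_2 - y_1)/(x_2 - x_1)$, and it reduces to the familiar independent-observations value $2\sigma^2/(x_2 - x_1)^2$ when $r = 0$.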

There is a well-established set of techniques for obtaining optimal designs when the model is linear in the parameters (see, for instance, Fedorov and Hackl, 1997, or Atkinson et al., 2007), but most of them assume independent observations. For a non-linear model, the usual approach is to linearize it and use the standard toolbox for the linearized model. In this case, initial values are needed for the non-linear parameters, and thus the designs obtained are only locally optimal.

For a nonzero $m$-dimensional vector $c$, the c-optimality criterion seeks the design that minimizes the variance of the best linear unbiased estimator of $c^t\beta$. When $c$ is taken to be each one of the Euclidean vectors $(1,0,0,\dots)$, $(0,1,0,\dots)$, $\dots$, c-optimality provides the best designs for the estimation of each parameter, which in particular are needed for the standardized criteria (Dette, 1997) that take into account the scale of the parameters. These designs can also be used to check how good a specific design is for the estimation of each one of the model parameters. Specifically, a design $\xi$ is c-optimal if it minimizes $\mathrm{Var}(c^t\hat{\beta}) = c^t M^{-}(\xi)\, c$, where $M^{-}$ is a generalized inverse of $M$ (more information on generalized inverses can be found, for instance, in Yanai et al., 2011 or Pukelsheim, 2006, where a study of generalized inverses in relation to c-optimality is carried out). c-optimal designs are very often singular (that is, the information matrix of the corresponding design is singular), which makes them harder to obtain.

Elfving (1952) provided a graphical method for finding c-optimal designs when the observations are independent. The method identifies a key point: the intersection of the line defined by $c$ with the boundary of the ‘Elfving Set’ (the convex hull of $f(\mathcal{X}) \cup -f(\mathcal{X})$). This optimal point is a convex combination of, at most, $m$ points of the set $\pm f(\mathcal{X})$ (Fellman, 1974). The result can be stated as follows:

Theorem 1.1

Assuming model (1) and independent observations, the approximate design

$$\xi_c = \begin{Bmatrix} x_1 & \cdots & x_r \\ p_1 & \cdots & p_r \end{Bmatrix},$$

where the $x_i$ are the support points and $p_i$ is the weight that point $x_i$ has in the design ($p_1 + \dots + p_r = 1$), is c-optimal if there exists a point $c^*$ belonging to the line defined by the vector $c$ and to the boundary of the convex hull of $f(\mathcal{X}) \cup -f(\mathcal{X})$ that can be expressed as $\pm p_1 f(x_1) \pm \dots \pm p_r f(x_r)$. That is, $c^*$ has maximal norm among the points of the convex hull that can be expressed as $\gamma c$ with $\gamma$ a scalar.

The proof of the original theorem can be found in Elfving (1952).

This procedure is especially suitable for two-dimensional models, for which the required convex hull is easy to obtain, and produces approximate designs. Pukelsheim and Torsney (1991) give a method for computing c-optimal weights given the support points, López-Fidalgo and Rodríguez-Díaz (2004) generalize Elfving’s method to the multi-dimensional case, and Harman and Jurik (2008) improve the computation by using linear programming. Pukelsheim (2006) also contains an updated approach to Elfving’s theorem.
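For a two-parameter model with independent observations, the geometric construction in Theorem 1.1 can be sketched by brute force. The code below is only an illustration under stated assumptions, not the implementation of any of the cited papers: the function name `elfving_2d`, the finite candidate grid, and the straight-line model $f(x) = (1, x)^t$ on $[-1, 1]$ are all hypothetical choices. It relies on the fact that, in two dimensions, the boundary point on the line through $c$ lies on a segment joining two points of $\pm f(\mathcal{X})$, so it is enough to scan all pairs.

```python
from itertools import combinations


def elfving_2d(f, xs, c, tol=1e-12):
    """Brute-force Elfving construction for a two-parameter model.

    f(x) returns the pair (f1(x), f2(x)); xs is a finite candidate grid
    (an assumption of this sketch); c is the target vector.
    Returns (gamma, design): gamma * c is the boundary point of the
    Elfving set, and design is a list of (support point, weight) pairs.
    """
    # Candidate vertices of the Elfving set: f(x) and -f(x) for each grid point.
    pts = [(s * f(x)[0], s * f(x)[1], x) for x in xs for s in (1, -1)]
    best_gamma, best_design = 0.0, []
    for (u1, u2, xu), (v1, v2, xv) in combinations(pts, 2):
        # Solve gamma * c = p * u + (1 - p) * v for (gamma, p) by Cramer's rule.
        d1, d2 = u1 - v1, u2 - v2
        det = c[1] * d1 - c[0] * d2
        if abs(det) < tol:
            continue  # segment parallel to c: no unique crossing
        gamma = (d1 * v2 - d2 * v1) / det
        p = (c[0] * v2 - c[1] * v1) / det
        # Keep the crossing only if it lies on the segment and goes further out.
        if -tol <= p <= 1 + tol and gamma > best_gamma:
            weights = [(xu, p), (xv, 1 - p)]
            best_gamma = gamma
            best_design = [(x, w) for x, w in weights if w > tol]
    return best_gamma, best_design


# Straight-line model f(x) = (1, x) on a grid over [-1, 1]; c = (0, 1) targets the slope.
grid = [-1 + 0.25 * k for k in range(9)]
gamma, design = elfving_2d(lambda x: (1.0, x), grid, (0.0, 1.0))
variance = 1 / gamma ** 2  # by Elfving's theorem, the optimal c^t M^- c is 1 / gamma^2
print(gamma, sorted(design), variance)
```

For this grid the sketch puts weight 1/2 on each of $x = -1$ and $x = 1$, with $\gamma = 1$. The pairwise scan grows quadratically with the grid size and does not extend beyond two parameters, which is where linear-programming formulations such as the one cited above become preferable.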

In the next section, a procedure for obtaining c-optimal designs when the observations are correlated is presented. Roughly speaking, the idea is to apply a change of variables that turns the information matrix (2) into the form $\tilde{X}^t \tilde{X}$, so that some ideas from Elfving’s method for independent observations can be applied, now to the new design matrix $\tilde{X}$. Although the method is not suitable for every covariance structure, it can be used to solve some cases, and several examples of these applications are shown in Section 3. Finally, Section 4 summarizes the results of the paper and introduces some other cases where the technique could be applied. Recent papers related to the present work are Tommasi et al. (2014), which computes c-optimal designs for log-linear models, and Dette et al. (in press), where optimal designs for comparing models under correlation are computed.
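The exact change of variables is not given in this excerpt. One standard way to reach the form $\tilde{X}^t \tilde{X}$, assumed here purely for illustration, is Cholesky whitening: if $\Sigma = L L^t$ with $L$ lower triangular, then $\tilde{X} = L^{-1} X$ satisfies $\tilde{X}^t \tilde{X} = X^t \Sigma^{-1} X$. The sketch below checks this numerically with the standard library only; the helper names, the three design points and the exponential kernel $\rho(d) = e^{-d}$ are hypothetical choices, not taken from the paper.

```python
import math


def cholesky(S):
    """Lower-triangular L with L L^t = S, for a symmetric positive definite S."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def forward_solve(L, B):
    """Return Z = L^{-1} B by forward substitution (B is n x m)."""
    n, m = len(B), len(B[0])
    Z = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = sum(L[i][k] * Z[k][j] for k in range(i))
            Z[i][j] = (B[i][j] - acc) / L[i][i]
    return Z


def gram(A):
    """Return the m x m matrix A^t A for an n x m matrix A."""
    n, m = len(A), len(A[0])
    return [[sum(A[r][i] * A[r][j] for r in range(n)) for j in range(m)]
            for i in range(m)]


# Assumed example: straight-line model at three distinct points,
# with covariance cov[y(a), y(b)] = exp(-|a - b|).
xs = [0.0, 0.5, 1.0]
Sigma = [[math.exp(-abs(a - b)) for b in xs] for a in xs]
X = [[1.0, x] for x in xs]

X_tilde = forward_solve(cholesky(Sigma), X)  # whitened design matrix
M = gram(X_tilde)                            # equals X^t Sigma^{-1} X
print(M)
```

Working with a triangular solve avoids forming $\Sigma^{-1}$ explicitly. The factorization requires a strictly positive definite $\Sigma$, so the design points must be distinct for a kernel of this type.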

Section snippets

A derivation of Elfving’s method for correlated observations

A strictly positive definite covariance matrix will be assumed, that is, every eigenvalue of $\Sigma$ will be assumed to be greater than zero. Throughout the paper, different examples using strictly positive definite stationary covariance kernels $\rho(\cdot)$ will be shown (see Dette et al., 2013; Dette et al., 2015), giving rise to the covariance structure $\mathrm{cov}[y(x_i), y(x_j)] = \rho(|x_i - x_j|)$. For this type of correlation, distinct support points should be assumed in order to avoid singular covariance matrices, thus

Examples of application

In the following, some examples dealing with different stationary covariance kernels will be shown. First, the case of constant covariance will be studied.

Conclusions and discussion

A new procedure for the computation of c-optimal designs in the correlated setup is introduced, relating the problem to that of independent observations and thus being able to use some ideas from Elfving (1952). Analytical results have been obtained for two-parameter models with an intercept for two-point designs, assuming either constant covariance or a strictly positive definite covariance kernel. A general procedure for a number of observations greater than the number of parameters has been

Acknowledgments

Research was supported by the Spanish Ministry of Economy and Competitiveness and Junta de Castilla y León (Grants ‘MTM 2013-47879-C2-2-P’ and ‘SA130U14’).

References (19)

  • M.G. Kenward et al. An improved approximation to the precision of fixed effects from restricted maximum likelihood. Comput. Statist. Data Anal. (2009)
  • C. Tommasi et al. Integral approximations for computing optimum designs in random effects logistic regression models. Comput. Statist. Data Anal. (2014)
  • A.C. Atkinson et al. Optimum Experimental Designs, with SAS. (2007)
  • H. Dette. Designing experiments with respect to ‘Standardized’ optimality criteria. J. R. Stat. Soc. Ser. B Stat. Methodol. (1997)
  • H. Dette et al. A geometric characterization of c-optimal designs for heteroscedastic regression. Ann. Statist. (2009)
  • H. Dette et al. Optimal designs for linear models with correlated observations. Ann. Statist. (2013)
  • H. Dette et al. Design for linear regression models with correlated errors.
  • H. Dette et al. Optimal designs for comparing regression models with correlated observations. Comput. Statist. Data Anal. (2016)
  • G. Elfving. Optimum allocation in linear regression theory. Ann. Math. Statist. (1952)
There are more references available in the full text version of this article.
