Choosing cross-over designs when few subjects are available

https://doi.org/10.1016/j.csda.2007.05.002

Abstract

Cross-over designs are used extensively for experiments in many fields. If the n subjects are relatively scarce compared to the t treatments then universally optimal designs do not exist under these restrictions and a computational procedure is usually required to select the design. This arises, for example, if the subjects comprise several animals which are in short supply, due perhaps to weight or age limitations. It is shown that cyclic cross-over designs are available that have lower average variances for direct and carry-over elementary treatment contrasts than other cyclic cross-over designs described in the literature. Examples of these improved designs are given for typical values of t and n. It is further shown that in these circumstances it is sensible to guard against a choice of design that can become disconnected if a few observations are lost during the experimentation. These points are illustrated in detail by considering the selection of a cross-over design for an experiment involving seven treatments applied to four subjects.

Introduction

Cross-over designs are employed extensively in different areas of experimental research, including agriculture, psychology and clinical trials. Many of these designs are arranged so that each of n subjects receives a sequence of t distinct treatments, one treatment for each of p consecutive time-periods, and the response is measured at the end of each period. The chief purpose of this is to eliminate a potential source of variation between subjects when comparing treatment effects. A cross-over design is uniform if each treatment occurs equally often in a period and, for each subject, each treatment occurs in the same number of periods. The most common uniform designs occur with p=t and attention is confined to this case in the present paper; for consideration of other uniform cross-over designs, see Jones and Kenward (2003) and Bate and Jones (2006). Furthermore, the designs usually need to take account of possible effects from immediately preceding treatments that may affect observed responses. These effects are defined as (first-order) carry-over effects. A cross-over design is balanced for carry-over effects if, within the subjects, every treatment precedes every other treatment the same number of times. Hedayat and Afsarinejad (1978) consider choosing cross-over designs when n is a multiple of t: they show that a balanced uniform design is universally optimal for the estimation of both direct and carry-over treatment effects, provided that the class of competing designs is confined to uniform cross-over designs. This restriction is weakened by Cheng and Wu (1980), Kunert (1984), Hedayat and Yang (2003, 2004) and others; see also Jones and Kenward (2003, Section 4.3) and Bate and Jones (2006).
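The two definitions above (uniform, and balanced for carry-over effects) can be checked mechanically. The following is a minimal sketch, not taken from the paper: it assumes a design is stored as a list of subject sequences with treatments labelled 0 to t-1, and the function names are hypothetical.

```python
from collections import Counter
from itertools import permutations


def is_uniform(design, t):
    """design[s][k] is the treatment given to subject s in period k."""
    p = len(design[0])
    # Every treatment must occur equally often within each period ...
    for k in range(p):
        counts = Counter(seq[k] for seq in design)
        if len({counts[a] for a in range(t)}) != 1:
            return False
    # ... and in the same number of periods for each subject.
    for seq in design:
        counts = Counter(seq)
        if len({counts[a] for a in range(t)}) != 1:
            return False
    return True


def carryover_counts(design):
    """Count how often treatment a immediately precedes treatment b."""
    return Counter(
        (seq[k], seq[k + 1]) for seq in design for k in range(len(seq) - 1)
    )


def is_carryover_balanced(design, t):
    """True if every ordered pair of distinct treatments occurs equally often."""
    counts = carryover_counts(design)
    return len({counts[pair] for pair in permutations(range(t), 2)}) == 1


# A Williams square for t=4 (one subject per sequence) and a plain cyclic square.
williams = [[0, 1, 3, 2], [1, 2, 0, 3], [2, 3, 1, 0], [3, 0, 2, 1]]
cyclic = [[0, 1, 2, 3], [1, 2, 3, 0], [2, 3, 0, 1], [3, 0, 1, 2]]
print(is_uniform(williams, 4), is_carryover_balanced(williams, 4))
print(is_uniform(cyclic, 4), is_carryover_balanced(cyclic, 4))
```

Both squares are uniform, but only the Williams square is balanced: in the plain cyclic square each treatment is always followed by the same successor.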

Situations sometimes arise when the subjects are relatively scarce compared to the number of treatments and n cannot be chosen to be a multiple of t. For example, this situation may occur if the subjects under investigation are animals which are in short supply. Perhaps a transgenic knock-out strain is difficult to breed so that producing sufficient animals within a specific weight and age range is difficult and few subjects are available at any one time. The same reasoning applies to a clinical trial where the disease is uncommon and it is usually quite difficult to find patients. A similar case arises if the subjects are volunteers who have a specialized knowledge. In such circumstances it is necessary to consider cross-over designs where there are fewer subjects than treatments, but much of the published work on cross-over designs does not apply to cases where n<t. However, two approaches to this problem have been considered in the literature.

Russell (1991) suggests a two-stage procedure for obtaining a design with fewer subjects than treatments. The first stage begins with a t×t parent Latin square, and this choice of square depends on whether there are an even or an odd number of treatments. If t is even then a balanced square due to Williams (1949) is suggested; if t is odd then Russell proposes a particular square which has a “nearly balanced” construction. Both the Williams and the Russell squares require the specification of the initial column only, then the remainder of the square is found by the well-known cyclic generation method and, following Houston (1966), such Latin squares are termed cyclic parent squares. The second stage of Russell's procedure is to select n columns of the parent square such that the average variance of elementary contrasts for direct or for carry-over treatment effects is minimized. Russell suggests a systematic computational method for examining sets of n columns from the t columns of the square, which he terms the “basic sets”, and the optimal design is found from among these basic sets. This procedure is therefore straightforward to apply and leads to a structured design based on a cyclic square. A good, if not efficient, cross-over design is obtained by this method, provided that the dimensions of the design are not abnormally large. Further developments of cross-over designs have built on the approach of Russell (1991); e.g. cross-over designs constructed so that a test for interaction between direct effects and carry-over effects can be made (Russell and Lewis, 1997) and cross-over designs with carry-over effects resulting from two unrelated factors (Lewis and Russell, 1998).
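The first stage of this procedure can be sketched for even t as follows. This is an illustration under stated assumptions, not Russell's own code: the initial column 0, 1, t-1, 2, t-2, ... is the standard Williams construction, the function names are hypothetical, and the reduction to column sets containing column 0 is a simple symmetry argument (adding a constant mod t only relabels the treatments) that need not coincide with Russell's "basic sets". The variance-based ranking of the candidates in the second stage is not implemented here.

```python
from itertools import combinations


def williams_initial_column(t):
    """Initial column 0, 1, t-1, 2, t-2, ... of a Williams square (t even)."""
    if t % 2:
        raise ValueError("this construction needs an even number of treatments")
    col = [0]
    lo, hi = 1, t - 1
    while lo <= hi:
        col.append(lo)
        if lo != hi:
            col.append(hi)
        lo, hi = lo + 1, hi - 1
    return col


def cyclic_square(initial):
    """Return the t columns of the cyclic square; column j is initial + j (mod t)."""
    t = len(initial)
    return [[(a + j) % t for a in initial] for j in range(t)]


def candidate_designs(initial, n):
    """Yield n-column sub-designs that contain column 0 of the parent square."""
    cols = cyclic_square(initial)
    t = len(initial)
    for rest in combinations(range(1, t), n - 1):
        yield [cols[0]] + [cols[j] for j in rest]


col = williams_initial_column(6)
print(col)
print(len(list(candidate_designs(col, 4))))
```

Each candidate is a list of n subject sequences of length t, so it can be passed directly to whatever criterion is used to rank designs.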

A different approach to the problem of obtaining a cross-over design is advocated by Eccleston and Whitaker (1999), John and Russell (2003) and others. This approach is to employ a general computational search through many possible designs with the required specification. A design which is found to satisfy a chosen criterion of optimality is accepted as the final design, e.g. John and Russell (2003) seek designs which maximize the average efficiency factors for direct and/or carry-over effects. One would expect this approach to give a more efficient design than that of Russell (1991) because these algorithms search through a larger set of designs. However, a “best” design may not be given by this procedure; for example, Jones and Donev (1996, p. 1437) remark that a search algorithm “can get trapped at a local optimum” so that the selected design “will more than likely be one that is useful in practice”. In general, a cross-over design obtained by this approach will not be unique and will depend upon a starting value, i.e. the random seed chosen. So the method should be regarded as providing a family of designs upon repeated searches with different starting values. The number of designs making up this family will depend on resources, particularly the time and the perseverance of the operator.

The purposes of this paper are twofold and each is aimed at clarifying the position for the practitioner who is seeking a uniform cross-over design when p=t and n<t. First, it is pointed out in Section 2 of this paper that cross-over designs where there are fewer subjects than treatments can be obtained which retain the structural simplicity of the Russell approach but may be more efficient than the designs of Russell (1991). These designs are referred to as small cyclic cross-over designs and are obtained by enlarging the set of parent squares to the set of all balanced cyclic squares (n even) or all nearly balanced cyclic squares (n odd), for which the Williams and the Russell squares are special cases. This is achieved by means of two simple algorithms which are straightforward to apply. A design which is efficient relative to this larger set is guaranteed by the method. It is possible in principle to enlarge these sets still further by basing the parent squares on multiplication tables of non-cyclic groups or even loops (cf. Denes and Keedwell, 1974, Chapter 3). Such an enlargement would destroy the simple form of parent squares which provide designs described here; however, an interesting exception to this is the case t=9 for which a column-balanced non-cyclic square exists (Hedayat and Afsarinejad, 1978). Examples of the small cyclic and other designs are given in Sections 2 and 3. The designs are tabulated in the Appendix for t=5 up to t=11 and for 4 ≤ n ≤ t-1.
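To illustrate what the enlarged set of balanced parent squares looks like, the sketch below enumerates, by brute force, every initial column that generates a balanced cyclic square for a given even order t. It is not one of the paper's two algorithms, which are not reproduced in this text. It relies on the fact that a cyclic square is balanced for carry-over exactly when the t-1 successive differences (mod t) down its initial column are all distinct, so that each ordered pair of treatments occurs once across the t columns. The function name is hypothetical.

```python
def balanced_initial_columns(t):
    """Yield initial columns starting at 0 whose successive differences
    (mod t) are all distinct, i.e. generators of balanced cyclic squares."""

    def extend(col, used_vals, used_diffs):
        if len(col) == t:
            yield list(col)
            return
        for nxt in range(1, t):
            d = (nxt - col[-1]) % t
            if nxt in used_vals or d in used_diffs:
                continue
            # Tentatively place nxt, recurse, then undo (backtracking).
            col.append(nxt)
            used_vals.add(nxt)
            used_diffs.add(d)
            yield from extend(col, used_vals, used_diffs)
            col.pop()
            used_vals.remove(nxt)
            used_diffs.remove(d)

    yield from extend([0], {0}, set())


for column in balanced_initial_columns(4):
    print(column)
```

For odd t the generator yields nothing: the differences 1, ..., t-1 sum to a multiple of t, so the last entry would have to repeat the first. This is consistent with the use of nearly balanced squares in the odd case.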

The second purpose of this paper is to draw attention to a serious problem in the field when performing planned experiments using some cross-over designs of this size, which include experiments involving animal or human subjects. This problem, which is referred to by several researchers, is the possibility that one or more observations are lost during the experimental process; for example, Senn (2002, p. 6) comments that: “missing observations continue to be one of the major problems in interpreting clinical trials, and cross-overs are no exception”. If some values are lost when there are few subjects compared to treatments, there is a real possibility that the eventual design used for the experiment is disconnected with respect to either direct effects or carry-over effects or, in many cases, both sets of treatment effects. The characteristic property of a disconnected eventual design is that the test of the usual null hypotheses that all direct effects and/or carry-over effects have the same value breaks down and that very few comparisons can be made between any pairs of direct and/or carry-over effects. In these circumstances, the experiment will be damaged severely and very little can be achieved from it. It is possible to assess the vulnerability of an original design to observation loss by a procedure of Godolphin (2004, 2006) which specifies (Type I) rank reducing observation sets (RROSs) for the cross-over design. It is suggested that this information should play a part in discriminating between designs. In our opinion, it is reasonable to expect the statistical practitioner to select an original design which is not only relatively efficient but which also guards against the unwelcome possibility that, if a few observations are lost, then the eventual design may be disconnected.

The computational algorithms for generating the set of balanced and the set of nearly balanced parent Latin squares are described in Section 2 and the sizes of these sets are found from n=4 to 12. A routine for obtaining the small cyclic cross-over designs and for deriving their RROSs is also discussed. It is desirable to choose a design with the property that the average variances of elementary direct and carry-over treatment contrasts are small. Also this design should possess the property that the minimal size of the rank reducing observation sets is large. These two aims are discussed and illustrated in Section 3 by considering the choice of a design with t=7 and n=4. The consequences of a naïve design choice for these values of t and n are discussed together with alternative design choices which are given by computational procedures available in the literature. A brief discussion is also given in Section 4 concerning general criteria for choosing cross-over designs where there are fewer subjects than treatments.

Construction of parent squares

Let $Y$ denote an $nt \times 1$ vector of observations, given in standard form by

$$Y = 1_{nt}\mu + X_1\tau + X_2\rho + X_3\alpha + X_4\beta + \varepsilon,$$

where $\mu$ is a parameter, $\tau$, $\rho$, $\alpha$ and $\beta$ are vectors of treatment direct, treatment carry-over, row and column effects, respectively, $X_1$, $X_2$, $X_3$ and $X_4$ are corresponding components of the design matrix, $1_{nt}$ is the $nt \times 1$ vector, all of whose elements are unity, and $\varepsilon$ is a vector of disturbances such that $E[\varepsilon] = 0$ and $\mathrm{Var}[\varepsilon] = \sigma^2 I$. By convention, the t treatments are labelled $\{0, 1, \ldots, t-1\}$.
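The model above also gives a direct way to test whether a design remains connected after observations are lost. The sketch below is an illustration under stated assumptions and is not the RROS procedure of Godolphin (2004, 2006). It assumes that rows are periods and columns are subjects, that the first period carries no carry-over effect, and that a lost observation still leaves its treatment's carry-over in the next period. All contrasts in a block of effects are estimable exactly when dropping that block from the design matrix lowers the rank by t-1. Function names are hypothetical and ranks are computed exactly with fractions.

```python
from fractions import Fraction


def design_matrix(design, t, lost=()):
    """Rows of [1 | X1 | X2 | X3 | X4]; design[s][k] is the treatment of
    subject s in period k, and lost holds the (s, k) observations dropped."""
    n, p = len(design), len(design[0])
    rows = []
    for s, seq in enumerate(design):
        for k, a in enumerate(seq):
            if (s, k) in lost:
                continue
            row = [0] * (1 + 2 * t + p + n)
            row[0] = 1                       # overall mean mu
            row[1 + a] = 1                   # direct effect tau
            if k > 0:
                row[1 + t + seq[k - 1]] = 1  # carry-over effect rho
            row[1 + 2 * t + k] = 1           # period (row) effect alpha
            row[1 + 2 * t + p + s] = 1       # subject (column) effect beta
            rows.append(row)
    return rows


def rank(rows):
    """Exact matrix rank by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(r + 1, len(m)):
            if m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def estimable_dim(design, t, block, lost=()):
    """Number of independent estimable contrasts in 'direct' or 'carry' effects."""
    X = design_matrix(design, t, lost)
    start = 1 if block == "direct" else 1 + t
    reduced = [row[:start] + row[start + t:] for row in X]
    return rank(X) - rank(reduced)


def is_connected(design, t, lost=()):
    """True if all direct and all carry-over contrasts are estimable."""
    return all(estimable_dim(design, t, b, lost) == t - 1
               for b in ("direct", "carry"))
```

For example, `is_connected([[0, 1, 3, 2], [1, 2, 0, 3], [2, 3, 1, 0], [3, 0, 2, 1]], 4)` checks a complete Williams square, and passing a set of `(subject, period)` pairs as `lost` checks the eventual design after those observations are removed.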

In this section, the choice

Naïve design choices for t=7 and n=4

In this section, we illustrate the search for a design with the two properties of relatively high efficiency and relatively low vulnerability to observation loss by considering the particular case where t=7 and n=4. This case is considered in some detail by Russell (1991) so the illustration can be regarded as a contribution towards that discussion.

Our interest in this case was initiated by an experiment in which seven treatments are applied orally to each of four canine subjects in their food,

General discussion

The problem of selecting a suitable cross-over design for an experiment where subjects are scarce is a familiar one to statistical practitioners although it has had insufficient attention in the literature. The constraints p=t and n<t place the practitioner in a position where there is a lack of theoretical results, hence a choice of design must depend upon essentially computational work. For values of t and n which seem most likely to arise in practice, small cyclic designs and designs based

Acknowledgements

The authors wish to thank Heather Elliott and William Unsworth of GlaxoSmithKline Ltd for their helpful comments.

References (29)

  • J.D. Godolphin et al. On the connectivity of row–column designs. Utilitas Math. (2001)
  • B. Gordon. Sequences in groups with distinct partial products. Pacific J. Math. (1961)
  • A. Hedayat et al. Repeated measurements designs, II. Ann. Statist. (1978)
  • A. Hedayat et al. Universal optimality of balanced uniform cross-over designs. Ann. Statist. (2003)