Non-inferiority test based on transformations for non-normal distributions

doi:10.1016/j.csda.2016.10.004

Computational Statistics & Data Analysis

Volume 113, September 2017, Pages 73-87

https://doi.org/10.1016/j.csda.2016.10.004

Abstract

Non-inferiority trials are becoming very popular for comparative effectiveness research. These trials are required to show that the effect of an experimental treatment is not worse than that of a reference treatment by more than a specified margin. Hence, non-inferiority trials are of great importance when superiority cannot be claimed. A three-arm non-inferiority trial consisting of a placebo, a reference treatment, and an experimental treatment is considered. Unlike traditional approaches, the distributions of the endpoints corresponding to these treatments are assumed to be unknown, and test procedures for a three-arm non-inferiority trial are suggested based on monotone transformations in conjunction with a normal approximation. The resulting test procedures are flexible and robust. Theoretical properties of the proposed methods are also investigated. The performance of the suggested test procedures is compared to that of their counterparts using simulations. In terms of type I error and power, the proposed methods perform better than their counterparts in most cases. The usefulness of the proposed methods is further illustrated through an example.

Introduction

Non-inferiority trials comparing an experimental treatment with a reference treatment are becoming very popular as more and more effective treatments become available and fewer discoveries of new treatments are made. When clear superiority of an experimental treatment is not evident, the objective may be to demonstrate the non-inferiority of the experimental treatment compared to a reference treatment. A slightly less efficacious experimental treatment might be preferred to an established reference treatment because of its other benefits, such as lower toxicity, lower cost, being less debilitating, and being easier to administer. In this kind of trial, non-inferiority is established by showing that the efficacy of an experimental treatment is not less than that of a reference treatment by more than a specified small margin, known as the non-inferiority margin. Traditionally, these trials do not include a placebo group for ethical reasons and are often termed “two-arm” trials, since they include only the reference and experimental groups. Because the placebo arm is absent, they cannot establish direct proof of efficacy of the reference treatment over the placebo and require external validation, which is often questionable. Hence, two-arm non-inferiority trials without placebo lack the support of internal assay sensitivity. For a detailed description of the problem see Hung et al. (2003) and D’Agostino et al. (2003). Assay sensitivity refers to the ability of a trial to distinguish between effective and ineffective treatments. This can be established by assuming the constancy condition, i.e., that the patient population in the current active control trial and in the past placebo control trial remains unchanged (ICH, 2000). However, in practice it is difficult to validate this constancy assumption. As a consequence, it is often suggested to include a placebo group whenever it is feasible and ethically justifiable, as addressed in several regulatory guidelines (ICH E10, 2000; EMEA, 2005).

Pigeot et al. (2003) and Koch and Rohmel (2004) considered three-arm non-inferiority trials with the inclusion of the placebo group. These three-arm non-inferiority trials are useful as they are free from some of the difficulties described above. When placebo is included, one approach to establishing the non-inferiority of the experimental treatment is to show that the ratio $(μ_{E} - μ_{P}) / (μ_{R} - μ_{P})$ is greater than $θ \in (0, 1)$ , where $μ_{E}$ , $μ_{R}$ , and $μ_{P}$ are the mean effects corresponding to the experimental (E), reference (R), and placebo (P) groups, respectively, and $θ$ is determined through clinical reasoning that considers knowledge about the disease. For further details about the ratio method see Pigeot et al. (2003) and the references therein. A crucial assumption of some existing methods (see Pigeot et al., 2003, Koti, 2007, Hasler et al., 2008) for testing the non-inferiority hypothesis is that the endpoints are normally distributed. This assumption can lead to unreliable conclusions when the normality of the endpoints is questionable. In this article, we consider a general situation without assuming any distributions for the endpoints. When the distributions of the endpoints are unknown, non-inferiority testing can usually be performed based on a normal approximation using large sample theory. However, the true probability of incorrectly rejecting the null hypothesis (type I error) associated with an approximate test is generally not equal to the chosen nominal level for finite samples. Therefore the true type I error of an approximate test is typically either smaller or greater than the nominal level. If the level error, that is, the difference between the true type I error and the nominal level, is negative (positive), then the approximate test is conservative (anticonservative). A conservative test lacks power, and an anticonservative test has an inflated type I error. Our analysis in Section 2 shows that the magnitude of the level error associated with the normal approximation test depends on the degree of skewness of the underlying distributions and on the sample sizes of the three arms. Moreover, how large the sample size has to be also depends on the skewness of the distribution (Boos and Hughes-Oliver, 2000) and the hypothesized effect size.
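The notion of level error described above can be illustrated with a short simulation. The sketch below is not the statistic of Eq. (2.5), which is not reproduced here; it assumes that larger responses are better and that $μ_{R} > μ_{P}$, so that the ratio hypothesis can be rewritten as the linear contrast $μ_{E} - θ μ_{R} - (1 - θ) μ_{P} > 0$, and it studentizes that contrast with unpooled sample variances. The function names, the centered exponential errors, and the choices $θ = 0.8$ and $n = 30$ per arm are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist, mean, variance


def ratio_ni_statistic(x_e, x_r, x_p, theta):
    """Studentized contrast for H0: (mu_E - mu_P) / (mu_R - mu_P) <= theta.

    Assumes larger responses are better and mu_R > mu_P, so the ratio
    hypothesis is equivalent to mu_E - theta*mu_R - (1 - theta)*mu_P <= 0.
    """
    contrast = mean(x_e) - theta * mean(x_r) - (1.0 - theta) * mean(x_p)
    # Unpooled variance estimate of the contrast (sample variances, n - 1).
    var = (variance(x_e) / len(x_e)
           + theta ** 2 * variance(x_r) / len(x_r)
           + (1.0 - theta) ** 2 * variance(x_p) / len(x_p))
    return contrast / math.sqrt(var)


def empirical_type1_error(theta=0.8, n=30, alpha=0.05, reps=2000, seed=0):
    """Monte Carlo rejection rate on the null boundary with skewed errors.

    Illustrative setup (an assumption, not taken from the paper): each arm
    is mu_l + (Exp(1) - 1) with mu_P = 0, mu_R = 1 and mu_E = theta, so the
    ratio of mean differences equals theta exactly.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - alpha)  # upper-alpha normal critical value
    mu = {"E": theta, "R": 1.0, "P": 0.0}

    def draw(m):
        return [m + rng.expovariate(1.0) - 1.0 for _ in range(n)]

    rejections = 0
    for _ in range(reps):
        stat = ratio_ni_statistic(draw(mu["E"]), draw(mu["R"]), draw(mu["P"]), theta)
        rejections += stat > z
    return rejections / reps


if __name__ == "__main__":
    rate = empirical_type1_error()
    print(f"empirical type I error: {rate:.3f}; level error: {rate - 0.05:+.3f}")
```

A negative printed level error would indicate a conservative test in this configuration and a positive one an anticonservative test; the sign and size depend on the skewness, the sample sizes, and $θ$.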

Hence, the primary objective of this article is to reduce the level error associated with the non-inferiority test based on the normal approximation by removing the effects of skewness of the underlying distributions. One way to achieve this goal is to convert the test statistic using a monotone transformation so that the distribution of the transformed test statistic is nearly symmetric, and then to construct the critical point by inverting the transformation. For the one-sample case, Hall (1992) investigated the effects of such monotone transformations for constructing confidence intervals for the mean parameter, albeit not in the non-inferiority test setup. In a sense, this article extends Hall’s (1992) method to a three-arm trial in the non-inferiority context. The proposed tests are alternatives to the nonparametric rank-based non-inferiority test suggested by Munzel (2009) for three-arm non-inferiority trials. Munzel’s method tackles the breakdown of the normality assumption via a rank-based test that defines the non-inferiority hypothesis in terms of relative treatment effects rather than the usual mean effects for continuous data. Relative treatment effects are defined in terms of the expectation of the asymptotic rank transformation (page 3646, Munzel, 2009). Apart from the lack of a straightforward interpretation of relative treatment effects, our simulation studies show that Munzel’s method tends to be conservative when the effect size $(μ_{R} - μ_{P})$ is moderate to large (see Cohen, 1988).
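To make the idea of transforming and then inverting concrete, the following sketch implements a cubic monotone transformation of the kind studied by Hall (1992) for the one-sample Studentized mean. It is not the pair $g_{1}$, $g_{2}$ proposed in this article for the three-arm statistic. The constants $a = 1/3$ and $b = 1/6$ are the values usually quoted for the one-sample Studentized mean and should be read as an assumption here, as should the function names, which are hypothetical.

```python
import math
from statistics import NormalDist, mean


def _cbrt(v):
    """Real cube root, valid for negative arguments as well."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)


def sample_skewness(xs):
    """Moment estimator of skewness: m3 / m2**1.5 with central moments."""
    m = mean(xs)
    m2 = mean([(x - m) ** 2 for x in xs])
    m3 = mean([(x - m) ** 3 for x in xs])
    return m3 / m2 ** 1.5


def hall_g(x, gamma_hat, n, a=1.0 / 3.0, b=1.0 / 6.0):
    """Cubic transformation g(x) = x + c*x^2 + (c^2/3)*x^3 + d.

    Here c = a*gamma_hat/sqrt(n) and d = b*gamma_hat/sqrt(n). The cubic term
    makes g monotone, since g'(x) = (1 + c*x)^2 >= 0.
    """
    c = a * gamma_hat / math.sqrt(n)
    d = b * gamma_hat / math.sqrt(n)
    return x + c * x ** 2 + (c ** 2 / 3.0) * x ** 3 + d


def hall_g_inverse(y, gamma_hat, n, a=1.0 / 3.0, b=1.0 / 6.0):
    """Closed-form inverse, using g(x) = ((1 + c*x)^3 - 1) / (3*c) + d."""
    c = a * gamma_hat / math.sqrt(n)
    d = b * gamma_hat / math.sqrt(n)
    if c == 0.0:
        return y - d  # g reduces to a shift when there is no skewness term
    return (_cbrt(1.0 + 3.0 * c * (y - d)) - 1.0) / c


def corrected_critical_value(xs, alpha=0.05):
    """Upper critical value g^{-1}(z_alpha) for a one-sample Studentized mean."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    return hall_g_inverse(z, sample_skewness(xs), len(xs))


if __name__ == "__main__":
    data = [0.2, 0.4, 0.5, 0.9, 1.1, 1.6, 2.8, 4.7, 6.3, 9.5]
    print("normal critical value:   ", NormalDist().inv_cdf(0.95))
    print("corrected critical value:", corrected_critical_value(data))
```

Because the inverse is available in closed form, a test of the form "reject when the statistic exceeds $g^{-1}(z_{α})$" needs no numerical root finding.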

The rest of the paper is organized as follows. Section 2 reviews three-arm non-inferiority trials. Section 3 develops our proposed test procedures for a three-arm non-inferiority trial based on transformations. In Section 4, we discuss a three-arm non-inferiority test based on ranks proposed by Munzel (2009). Simulation results are reported in Section 5. The analysis approach is illustrated in Section 6 using data from a bone health study. A discussion follows in Section 7. For brevity, derivations of the theoretical results are provided in Appendix A.

Section snippets

Three-arm non-inferiority trials

To facilitate the discussion of a three-arm non-inferiority trial, let $X_{E, i}$ , $X_{R, j}$ , and $X_{P, k}$ $(i = 1, \dots, n_{E}, j = 1, \dots, n_{R}, k = 1, \dots, n_{P})$ denote the observations corresponding to the treatment response in the experimental (E), reference (R), and placebo (P) groups, respectively. We assume that $X_{E, i} \overset{i.i.d.}{\sim} F_{E} (μ_{E}, σ_{E}^{2})$ , $X_{R, j} \overset{i.i.d.}{\sim} F_{R} (μ_{R}, σ_{R}^{2})$ , and $X_{P, k} \overset{i.i.d.}{\sim} F_{P} (μ_{P}, σ_{P}^{2})$ , where $μ_{l} = E (X_{l})$ , $σ_{l}^{2} = V (X_{l})$ , and the $F_{l}$ are absolutely continuous because the $X_{l}$ are continuous random variables, $l \in \{E, R, P\}$ . Without loss of generality, we assume that

Proposed test procedures

In this section, we derive an Edgeworth expansion of the test statistic $S$ and then construct monotone transformations using the sample version of $C$ in order to suggest new test procedures for the hypothesis in Eq. (2.4). We begin with the asymptotic expansion of the distribution of $S$ . Let us define $λ_{n, l} = n_{l} / n$ and assume that $λ_{n, l} \to λ_{l}$ for $l \in \{E, R, P\}$ . This assumption ensures that $n_{E}$ , $n_{R}$ , and $n_{P}$ tend to infinity at the same rate.

Proposition 1

Assume that the distribution of $X_{l}$ is

Rank-based method

In our simulation studies we compare the proposed test procedures with the nonparametric non-inferiority test based on ranks. To briefly describe the method, Munzel (2009) suggested a rank-based test for a three-arm non-inferiority trial using relative treatment effects, which are Kruskal–Wallis-type functionals. The relative treatment effects can be defined as $p_{l} = \int H d F_{l}$ for $l \in \{E, R, P\}$ , where $H = \frac{n_{E}}{n} F_{E} + \frac{n_{R}}{n} F_{R} + \frac{n_{P}}{n} F_{P}$ . Based on these relative treatment effects, Munzel (2009) considered the following
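The relative treatment effects $p_{l} = \int H d F_{l}$ defined above have a simple plug-in estimate, sketched below. This illustrates the functional only, not the hypothesis or the test statistic of Munzel (2009). It assumes $n = n_{E} + n_{R} + n_{P}$ and replaces each distribution function by its normalized empirical version, in which a tie contributes one half; this is equivalent to $(\bar{R}_{l} - 1/2) / n$, where $\bar{R}_{l}$ is the mean pooled mid-rank of arm $l$. The function name and the toy data are hypothetical.

```python
def relative_effects(groups):
    """Plug-in estimates of p_l = integral of H dF_l for each arm.

    `groups` maps an arm label to its list of observations. H is estimated
    by the pooled, sample-size-weighted empirical distribution function.
    """
    pooled = [x for obs in groups.values() for x in obs]
    n = len(pooled)

    def h_hat(x):
        # Normalized pooled empirical cdf: tied values count one half.
        less = sum(y < x for y in pooled)
        equal = sum(y == x for y in pooled)
        return (less + 0.5 * equal) / n

    return {label: sum(h_hat(x) for x in obs) / len(obs)
            for label, obs in groups.items()}


if __name__ == "__main__":
    toy = {"P": [1.0, 2.0], "R": [3.0, 4.0], "E": [5.0, 6.0]}
    print(relative_effects(toy))
```

By construction the sample-size-weighted average of the estimates equals 1/2, so an arm with an estimate above 1/2 tends to produce larger responses than the pooled sample.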

Simulation studies

In this section, we conduct simulation studies to evaluate the behavior of the test procedures given in Eqs. (2.5), (3.2), and (4.3) in terms of type I error and power. For convenience, we refer to the tests in Eqs. (2.5) and (4.3) as $T_{N}$ and $T_{M}$ , respectively. The two proposed tests, $S > g_{1}^{- 1} (z_{α})$ and $S > g_{2}^{- 1} (z_{α})$ , given in Eq. (3.2) are denoted by $T_{g_{1}}$ and $T_{g_{2}}$ , respectively. We also consider a t-approximation using Satterthwaite–Welch degrees of freedom as an approximation of the

Example

This data example comes from an ongoing bone health study. The original goal of the study was to examine the influence of calcium intake and vitamin D exposure on biomarkers of calcium sufficiency. As it is well known that vitamin D enables calcium absorption, studies were done previously to test whether a combination of calcium and vitamin D supplementation has any significant effect on bone turnover measured by PTH (serum parathyroid hormone test). Clinical trials of fracture prevention

Discussion

We have developed two test procedures for three-arm non-inferiority trials without imposing any distributional assumptions (except continuity) on the treatment responses. Our suggested test procedures are based on monotone transformations in conjunction with a normal approximation. We have studied their theoretical properties and shown that they are asymptotically equivalent with respect to level error, i.e., both are second-order accurate and provide better type I error compared with normal

Acknowledgments

The research of the last author is partly supported by PCORI Grant Number ME-1409-21410 and NIH Grant Number P30-ES020957. We would like to thank the Editor, Associate Editor and two anonymous referees for their careful reading and constructive suggestions, which improved the readability of the paper.

References (21)

J. Aloia et al. The relative influence of calcium intake and vitamin D status on serum parathyroid hormone and bone turnover biomarkers in a double-blind, placebo-controlled parallel group, longitudinal factorial design. J. Clin. Endocrinol. Metab. (2008)
D. Boos et al. How large does $n$ have to be for $z$ and $t$ intervals? Amer. Statist. (2000)
O.E. Barndorff-Nielsen et al. Asymptotic Techniques for Use in Statistics (1989)
J. Cohen. Statistical Power Analysis for the Behavioral Sciences (1988)
EMEA. Guideline on the choice of the noninferiority margin (2005). Available...
R.B. D’Agostino et al. Noninferiority trials: Design concepts and issues-the encounters of academic consultants in statistics. Stat. Med. (2003)
Y. Dodge et al. The complications of the fourth central moment. Amer. Statist. (1999)
P. Hall. On the removal of skewness by transformation. J. R. Stat. Soc. Ser. B Stat. Methodol. (1992)
P. Hall. The Bootstrap and Edgeworth Expansion (1992)
M. Hasler et al. Assessing non-inferiority of a new treatment in a three-arm trial in the presence of heteroscedasticity. Stat. Med. (2008)

There are more references available in the full text version of this article.

☆ The R-code for our proposed methods is available in the supplementary material, "R-code for the proposed non-inferiority tests" (see Appendix B).
