Solutions to the Behrens–Fisher problem
Introduction
When testing the equality of the means of two independent, normally distributed populations whose variances are unknown but assumed equal, the classical Student's two-sample t-test is recommended. Student's t-test is asymptotically robust, and for finite m and n it possesses type I error robustness if m=n or if the distribution is symmetric; if m≠n and the distribution is skewed, the effects of departure from normality may be considerable [1]. If the underlying population distributions are normal with unequal and unknown variances, either Welch's t-statistic [2] or Satterthwaite's approximate F test [3] is suggested. However, Welch's procedure is non-robust under most non-normal distributions [4], [5]. Actual data are often non-smooth, multi-modal, highly skewed, and heavy-tailed [6], [7]. This has serious consequences because even slight departures from normality are known to substantially reduce power when testing hypotheses about means.
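For orientation, the unequal-variance statistic mentioned above can be sketched in a few lines. This is a minimal illustration of the standard textbook form of Welch's statistic with the Welch-Satterthwaite degrees of freedom, not the author's FORTRAN program; the function name `welch_t` and the numbers are hypothetical and are not data from the paper.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Return Welch's t statistic and the Welch-Satterthwaite df."""
    m, n = len(x), len(y)
    # Squared standard errors of the two sample means (variances not pooled).
    vx, vy = variance(x) / m, variance(y) / n
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (m - 1) + vy ** 2 / (n - 1))
    return t, df


# Illustrative numbers only (assumed for this sketch).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
t, df = welch_t(x, y)
print(f"t = {t:.3f}, df = {df:.2f}")
```

The degrees of freedom are generally non-integer, so a p-value would come from a t distribution evaluated at fractional df.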
Research often relies on Student's t-test to judge treatments and recommend new therapies. These tests are often abused, and there is reasonable criticism of their use when the underlying assumptions are not met. Fortunately, the strict assumptions of data independence, homogeneity of variances, and identical normal distributions are applied with loose criteria in practice.
Every elementary statistics textbook addresses the problem of testing the equality of the means from two populations. Few, however, offer alternatives when one or more of the underlying assumptions are not defensible. We have developed FORTRAN code that produces the statistics outlined in this paper, as suggested by Cressie and Whitford [8], Yuen and Dixon [9], and Yuen [5]. An executable version of the program is available from the author on request.
Numerical methods
The two-sample t-test assumes that both samples (Xi, i=1, 2, ..., m; Yj, j=1, 2, ..., n) are random and jointly independent, are identically distributed, come from normal populations (Xi∼N(μX, σ²), i=1, 2, ..., m and Yj∼N(μY, σ²), j=1, 2, ..., n), and have equal variances (σ²X=σ²Y=σ²). When these conditions hold, the test of H0: μX=μY against H1: μX≠μY or H1: μX>μY is based on the statistic T.
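Let

$$T = \frac{\bar{X} - \bar{Y}}{S_p \sqrt{\frac{1}{m} + \frac{1}{n}}},$$

where $\bar{X}$ and $\bar{Y}$ are the sample means, $S_p^2 = \frac{(m-1)S_X^2 + (n-1)S_Y^2}{m+n-2}$ is the pooled sample variance, and $S_X^2$ and $S_Y^2$ are the sample variances. Under H0, T follows Student's t distribution with m+n−2 degrees of freedom.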
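The equal-variance statistic T can be computed directly from the two samples. The following is a minimal sketch assuming the usual pooled-variance form of Student's two-sample statistic; the function name `pooled_t` and the sample values are hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x, y):
    """Return Student's pooled-variance t statistic and its df."""
    m, n = len(x), len(y)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((m - 1) * variance(x) + (n - 1) * variance(y)) / (m + n - 2)
    t = (mean(x) - mean(y)) / sqrt(sp2 * (1 / m + 1 / n))
    return t, m + n - 2


# Illustrative numbers only (assumed for this sketch).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
t, df = pooled_t(x, y)
print(f"T = {t:.3f} on {df} degrees of freedom")
```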
Example
Data for the example come from an experiment reported by Dolkart, Halperin and Perlman [11]. Two groups of mice (normal mice and diabetic mice) were treated with bovine serum albumin (BSA) for 28 days. On the 29th day the amount of BSA nitrogen bound, in μg/ml of undiluted mouse serum, was measured. The hypothesis of interest was that the average amount of BSA bound in normal mice would be greater than that bound in diabetic mice. The data were as follows: normal mice {155.76, 282.00,
Discussion
Testing the equality of two means from independent samples is a common statistical procedure and is covered in every elementary statistics textbook. When the underlying distributions are normal with equal population variances, we would use Student's t-test (T). When m=n, a test of equal means based on T possesses levels that are robust against heterogeneous variances. However, T is sensitive to non-normality.
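Part of the reason for the equal-sample-size remark can be made explicit. As a sketch, assuming T uses the usual pooled variance and writing $S_X^2$ and $S_Y^2$ for the two sample variances (notation assumed here), the pooled squared standard error reduces to the unpooled one when m=n:

```latex
\[
S_p^2\left(\frac{1}{n}+\frac{1}{n}\right)
= \frac{(n-1)S_X^2+(n-1)S_Y^2}{2n-2}\cdot\frac{2}{n}
= \frac{S_X^2}{n}+\frac{S_Y^2}{n}.
\]
```

With equal sample sizes the pooled statistic and the unequal-variance statistic are therefore numerically identical, and only the reference degrees of freedom differ.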
The first exact solution to the case where the distributions are normal but the
Program
We have written and tested a FORTRAN program that produces the statistics outlined in the Numerical Methods section. Both the executable program and sample data sets are available from the author on request (e-mail only). The input data file (TEST.DTA) is in free format, one record per observation (treatment group, outcome), where the treatment variable is an integer (either 1 or 2) and the outcome variable is continuous.
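To make the input layout concrete, the sketch below reads records of the form "treatment group, outcome" into two lists. It is not the author's program: the function name `read_groups` and the sample records are hypothetical, and the separator handling assumes that fields are delimited by blanks or commas.

```python
def read_groups(lines):
    """Split free-format 'group outcome' records into two outcome lists."""
    groups = {1: [], 2: []}
    for line in lines:
        # Assumed separators: blanks and/or commas between the two fields.
        fields = line.replace(",", " ").split()
        if not fields:
            continue  # skip blank lines
        group, outcome = int(fields[0]), float(fields[1])
        groups[group].append(outcome)  # raises KeyError if group is not 1 or 2
    return groups[1], groups[2]


# Hypothetical records, not data from the paper.
sample = ["1 10.5", "2  7.25", "1, 12.0", "", "2 8.0"]
x, y = read_groups(sample)
print(x, y)
```

Because the function accepts any iterable of lines, an open file can be passed directly, for example `with open("TEST.DTA") as fh: x, y = read_groups(fh)`.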
Acknowledgements
Anonymous referees serve as a valuable resource to this journal. I am very grateful for the thorough review, comments and suggestions that have improved the presentation of this paper. Thanks.
References (14)
- et al. Robust Inference (1986)
- The significance of the difference between two means when the population variances are unequal. Biometrika (1937)
- An approximate distribution of estimates of variance components. Biomet. Bull. (1946)
- et al. How to use the two sample t-test. Biomet. J. (1986)
- The two sample trimmed t for unequal population variances. Biometrika (1974)
- et al. Robustness in real life: a study of clinical laboratory data. Biometrics (1982)
- Some results on the Tukey–McLaughlin and Yuen methods for trimmed means when distributions are skewed. Biomet. J. (1990)