Optimal testing strategies for large, sparse multinomial models

doi:10.1016/j.csda.2003.08.002

Computational Statistics & Data Analysis

Volume 46, Issue 3, 15 June 2004, Pages 605-620

https://doi.org/10.1016/j.csda.2003.08.002

Abstract

Much has been written in the literature on testing for independence in contingency tables. A number of related topics have been studied, including the choice of test statistic, the appropriateness of asymptotic results versus exact tests, and the use of conditional or unconditional analyses. Much of the work to date has focused on relatively small contingency tables.

The literature on testing for Hardy–Weinberg equilibrium (HWE) is substantive as well. Most of the focus to date has been on loci with relatively small numbers of alleles. The increased use of genetic markers with large numbers of alleles (e.g., forensic DNA profiling) is quickly dating this previous work. The size of the multinomial vectors under consideration grows quickly. Even relatively large samples will produce somewhat sparse multinomial data sets with small expected frequencies, calling asymptotic results into question.
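To make the sparseness concrete, the following is a minimal sketch, not taken from the paper, that counts the genotype categories at a locus with k alleles (k(k+1)/2 unordered genotypes) and computes expected genotype counts under the standard HWE proportions (p_i² for homozygotes, 2·p_i·p_j for heterozygotes). The function names and the setting of 20 equally frequent alleles with 500 individuals are illustrative assumptions only.

```python
def genotype_categories(k):
    """Number of unordered genotypes at a locus with k alleles: k(k+1)/2."""
    return k * (k + 1) // 2


def hwe_expected_counts(allele_freqs, n):
    """Expected genotype counts under HWE for a sample of n individuals.

    Returns a dict keyed by (i, j) with i <= j, where i and j index alleles.
    """
    k = len(allele_freqs)
    expected = {}
    for i in range(k):
        for j in range(i, k):
            if i == j:
                prob = allele_freqs[i] ** 2          # homozygote
            else:
                prob = 2 * allele_freqs[i] * allele_freqs[j]  # heterozygote
            expected[(i, j)] = n * prob
    return expected


# Hypothetical setting (an assumption, not from the paper):
# 20 equally frequent alleles, 500 sampled individuals.
k, n = 20, 500
expected = hwe_expected_counts([1 / k] * k, n)
small = sum(1 for e in expected.values() if e < 5)
print(genotype_categories(k), small)
```

In this hypothetical setting every one of the 210 genotype categories has an expected count below 5 (1.25 for each homozygote, 2.5 for each heterozygote), even though the sample holds 500 individuals.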

In the face of large, sparse multinomials, we provide a comprehensive comparison of test statistics and testing strategies (asymptotic versus exact, conditional versus unconditional) to test for independence (i.e., the presence of HWE) at one locus with many alleles. Attained significance level and power are evaluated. We find that Fisher's exact test is most appropriate for small samples, and the asymptotic chi-square goodness-of-fit statistic works well with large samples, relative to the number of multinomial categories. The log-likelihood ratio statistic performs poorly.
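As a rough illustration of the kinds of tests being compared, the sketch below computes the Pearson chi-square statistic X² and the log-likelihood ratio statistic G² against HWE expectations built from estimated allele frequencies, and approximates a conditional p-value by Monte Carlo shuffling of alleles (which conditions on the observed allele counts). This is an assumption-laden sketch, not the authors' implementation: the function names are hypothetical, and the Monte Carlo test orders outcomes by X² rather than by the genotype-array probability that Fisher's exact test uses.

```python
import math
import random
from collections import Counter


def hwe_statistics(genotypes):
    """Return (X2, G2) for a list of genotypes, each a pair of allele labels.

    Expected counts use allele frequencies estimated from the sample.
    """
    n = len(genotypes)
    observed = Counter(tuple(sorted(g)) for g in genotypes)
    allele_counts = Counter(a for g in genotypes for a in g)
    alleles = sorted(allele_counts)
    x2 = g2 = 0.0
    for idx, a in enumerate(alleles):
        for b in alleles[idx:]:
            pa = allele_counts[a] / (2 * n)
            pb = allele_counts[b] / (2 * n)
            exp = n * (pa * pa if a == b else 2 * pa * pb)
            obs = observed.get((a, b), 0)
            x2 += (obs - exp) ** 2 / exp
            if obs > 0:  # empty cells contribute zero to G2
                g2 += 2 * obs * math.log(obs / exp)
    return x2, g2


def monte_carlo_p_value(genotypes, n_perm=2000, seed=0):
    """Monte Carlo conditional p-value, ordering outcomes by X2.

    Alleles are shuffled and re-paired at random, which keeps the
    allele counts fixed while simulating random union of gametes.
    """
    rng = random.Random(seed)
    observed_x2, _ = hwe_statistics(genotypes)
    alleles = [a for g in genotypes for a in g]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(alleles)
        shuffled = list(zip(alleles[0::2], alleles[1::2]))
        x2, _ = hwe_statistics(shuffled)
        if x2 >= observed_x2 - 1e-12:  # tolerance for floating-point ties
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: two alleles, homozygotes only (strong departure from HWE).
sample = [(1, 1)] * 10 + [(2, 2)] * 10
x2, g2 = hwe_statistics(sample)
p_mc = monte_carlo_p_value(sample)
print(x2, g2, p_mc)
```

An asymptotic p-value would instead refer X² or G² to a chi-square distribution; with k alleles the usual degrees of freedom are k(k−1)/2. The sketch omits that step to stay within the standard library.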

Section snippets

Introduction and notation

The analysis of categorical data, and associated statistical tests, has been debated extensively in the literature. Three key questions arise from the debate:

1. What is the most appropriate test statistic to use, in terms of attained significance level and power?
2. Do asymptotic results apply, or should one use an exact test?
3. Should one perform a conditional or unconditional analysis?

Much of the theoretical and simulation work to address these issues has been restricted to consideration of relatively

Statistical issues

Many authors have studied tests involving categorical data and the related issues presented in this paper. Much of the work to date has been in the context of relatively small contingency tables, or perhaps 2×n tables. Recent extensions have been made to higher dimension tables (Zelterman et al., 1995; Parshall et al., 1999). In addition, a number of authors have reviewed tests for HWE, which present essentially the same categorical data problem. Again, much of this work has been limited to

Statistical methods for unconditional tests

The presence of nuisance parameters in the null hypothesis given by Eq. (2) makes the problem of testing for HWE with many alleles more complex. One does not want to specify the allele frequencies; only the functional form of HWE needs to hold. A true exact unconditional test for HWE calculates $p\text{-value} = \sup_{\vec{p}} \Pr_{\vec{p}}(T \geq t)$ (Suissa and Shuster, 1985; Berger and Boos, 1994), where T represents the test statistic and t is its observed value. The calculation of the supremum over a parameter space of just

Results

In this section, we summarize the results by comparing conditional and unconditional tests, exact and asymptotic tests, and the test statistics themselves. Table 4 lists attained significance levels for each of the 11 tests, and for each combination of settings for n and $\vec{p}$. Tables 5 and 6 list the empirical power when f=0.05 and 0.10, respectively, for each test and for each combination of n and $\vec{p}$. Results for f=0.01 (not shown) are similar with respect to guidelines, with an obvious

Acknowledgements

This work was supported in part by National Institutes of Health grant GM32518.

References (35)

A. Agresti et al.
An empirical investigation of some effects of sparseness in contingency tables
Comput. Statist. Data Anal.
(1987)
J. Berkson
In dispraise of the exact test
J. Statist. Plann. Inference
(1978)
K. Wakimoto et al.
Testing the goodness of fit of the multinomial distribution based on graphical representation
Comput. Statist. Data Anal.
(1987)
A. Agresti
A survey of exact inference for contingency tables
Statist. Sci.
(1992)
A. Agresti et al.
Some exact conditional tests of independence for R× C cross-classification tables
Psychometrika
(1977)
R.L. Berger et al.
P values maximized over a confidence set for the nuisance parameter
J. Amer. Statist. Assoc.
(1994)
Budowle, B., Moretti, T.R., 1999. Genotype profiles for six population groups at the 13 CODIS short tandem repeat core...
Budowle, B., Monson, K.L., Anoe, K.S., Baechtel, F.S., Bergman, D.L., Buel E., Campbell, P.A., Clement, M.E., Coey,...
R. Chakraborty et al.
Statistical power of an exact test of Hardy–Weinberg proportions of genotypic data at a multiallelic locus
Hum. Heredity
(1994)
J.W. Chapman
A comparison of the X², $-2 \log R$, and multinomial probability criteria for significance tests when expected frequencies are small
J. Amer. Statist. Assoc.
(1976)
W.G. Cochran
Some methods for strengthening the common χ² tests
Biometrics
(1954)
N. Cressie et al.
Pearson's X² and the loglikelihood ratio statistic G²: a comparative review
Internat. Statist. Rev.
(1989)
N. Cressie et al.
Multinomial goodness of fit tests
J. Roy. Statist. Soc. B
(1984)
R.B. D'Agostino et al.
The appropriateness of some common procedures for testing the equality of two independent binomial populations
Amer. Statist.
(1988)
T.H. Emigh
A comparison of tests for Hardy–Weinberg equilibrium
Biometrics
(1980)
J.D. Gibbons et al.
P-values: interpretation and methodology
Amer. Statist.
(1975)
S.W. Guo et al.
Performing the exact test of Hardy–Weinberg proportion for multiple alleles
Biometrics
(1992)
