Bootstrap confidence regions in multinomial sampling☆
Introduction
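Let $Y_1, Y_2, \ldots, Y_n$ be independent and identically distributed random variables with realizations on the sample space $\{1, \ldots, k\}$. Consider the probability vector $p = (p_1, \ldots, p_k)$, with components $p_j = P(Y_i = j) > 0$, and the random variables

$$X_j = \sum_{i=1}^{n} I_{\{j\}}(Y_i), \qquad j = 1, \ldots, k.$$

The statistic $(X_1, \ldots, X_k)$ is sufficient for $p$ in the statistical model under consideration and is multinomially distributed; that is,

$$P(X_1 = x_1, \ldots, X_k = x_k) = \frac{n!}{x_1! \cdots x_k!}\, p_1^{x_1} \cdots p_k^{x_k}$$

for integers $x_1, \ldots, x_k \geq 0$ such that $x_1 + \cdots + x_k = n$.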
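As a small illustration of the sampling model, the following Python sketch tabulates the counts $X_j$ from simulated observations and evaluates the multinomial probability of a count vector. The function names, the seed and the example probabilities are illustrative assumptions, not part of the paper.

```python
import math
import numpy as np

def category_counts(y, k):
    """Return (X_1, ..., X_k), where X_j is the number of observations equal to j."""
    y = np.asarray(y)
    # bincount indexes from 0, so drop the unused category 0
    return np.bincount(y, minlength=k + 1)[1:]

def multinomial_pmf(x, p):
    """P(X_1 = x_1, ..., X_k = x_k) for counts x and probabilities p (all p_j > 0)."""
    n = int(sum(x))
    log_pmf = math.lgamma(n + 1)  # log n!
    for xj, pj in zip(x, p):
        log_pmf += xj * math.log(pj) - math.lgamma(xj + 1)
    return math.exp(log_pmf)

# Hypothetical example: n = 20 observations on the sample space {1, 2, 3}
rng = np.random.default_rng(0)
p = [0.2, 0.3, 0.5]
y = rng.choice([1, 2, 3], size=20, p=p)
x = category_counts(y, k=3)
print(x, multinomial_pmf(x, p))
```

Working on the log scale avoids overflow in the factorials when $n$ is large.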
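For testing

$$H_0 : p = p_0 \tag{1.1}$$

the classical goodness-of-fit Pearson statistic is

$$X^2 = n \sum_{j=1}^{k} \frac{(\hat{p}_j - p_{j0})^2}{p_{j0}}, \tag{1.2}$$

where $p_0 = (p_{10}, \ldots, p_{k0})$, $\hat{p} = (\hat{p}_1, \ldots, \hat{p}_k)$ and $\hat{p}_j = X_j / n$, $j = 1, \ldots, k$. Other common tests are based on $\varphi$-divergences between the observed and theoretical proportions, $\hat{p}$ and $p_0$ respectively. This family of divergences was introduced by Csiszár [2] and Ali and Silvey [1], for convex functions $\varphi : (0, \infty) \to \mathbb{R}$, by the formula

$$D_\varphi(p, q) = \sum_{j=1}^{k} q_j\, \varphi\!\left(\frac{p_j}{q_j}\right), \qquad \varphi \in \Phi,$$

where $\Phi$ is the class of all convex functions $\varphi(x)$, $x > 0$, such that at $x = 1$, $\varphi(1) = 0$, $\varphi''(1) > 0$, and at $x = 0$, $0\,\varphi(0/0) = 0$ and $0\,\varphi(p/0) = p \lim_{u \to \infty} \varphi(u)/u$. Their properties and statistical applications have been extensively studied in Liese and Vajda [5] and Vajda [8]. For every $\varphi \in \Phi$ that is differentiable at $x = 1$, the function

$$\psi(x) = \varphi(x) - \varphi'(1)(x - 1)$$

also belongs to $\Phi$. Then $D_\psi(p, q) = D_\varphi(p, q)$, and $\psi$ has the additional property that $\psi'(1) = 0$. Because the two divergence measures are equivalent, we may consider the set $\Phi$ to be equivalent to the set $\Phi^* \equiv \Phi \cap \{\varphi : \varphi'(1) = 0\}$.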
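The equivalence of the divergences generated by $\varphi$ and by $\psi(x) = \varphi(x) - \varphi'(1)(x-1)$ can be checked numerically. The sketch below assumes the usual definition $D_\varphi(p, q) = \sum_j q_j \varphi(p_j / q_j)$ and the usual form of Pearson's statistic, and uses $\varphi(x) = x \log x$ (for which $\varphi'(1) = 1$) as an illustrative choice; the counts and null probabilities are made up for the example.

```python
import numpy as np

def phi_divergence(p, q, phi):
    """D_phi(p, q) = sum_j q_j * phi(p_j / q_j), assuming all p_j > 0 and q_j > 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(q * phi(p / q)))

def pearson_statistic(x, p0):
    """Pearson's X^2 = n * sum_j (p_hat_j - p0_j)^2 / p0_j with p_hat_j = x_j / n."""
    x = np.asarray(x, dtype=float)
    p0 = np.asarray(p0, dtype=float)
    n = x.sum()
    p_hat = x / n
    return float(n * np.sum((p_hat - p0) ** 2 / p0))

def phi(u):
    # Illustrative convex function with phi(1) = 0 and phi'(1) = 1
    return u * np.log(u)

def psi(u):
    # psi(u) = phi(u) - phi'(1) * (u - 1), so that psi'(1) = 0
    return phi(u) - (u - 1.0)

# Hypothetical data: n = 100 observations in k = 3 categories
x = np.array([18, 55, 27])
p0 = np.array([0.25, 0.50, 0.25])
p_hat = x / x.sum()

d_phi = phi_divergence(p_hat, p0, phi)
d_psi = phi_divergence(p_hat, p0, psi)
pearson = pearson_statistic(x, p0)
print(d_phi, d_psi, pearson)
```

The two divergence values agree because the linear term contributes $\sum_j q_j (p_j / q_j - 1) = 0$ whenever both vectors sum to one.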
In statistical inference, an important family of $\varphi$-divergences is the one introduced and studied by Cressie and Read [3]. The functions $\varphi_\lambda$ of this so-called power-divergence family are given in (1.4). Observe that $\varphi_\lambda(x)$ and $\psi_\lambda(x) \equiv \varphi_\lambda(x) - (x - 1)/(\lambda + 1)$ define the same divergence measure. In the following, the power-divergence measures are denoted by
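Zografos et al. [11] established under (1.1) that

$$T_{\varphi,n} = \frac{2n}{\varphi''(1)}\, D_\varphi(\hat{p}, p_0), \tag{1.5}$$

where $D_\varphi(\hat{p}, p_0)$ is the $\varphi$-divergence between the observed proportions $\hat{p}$ and $p_0$, is asymptotically chi-square distributed with $k - 1$ degrees of freedom. Pearson's statistic (1.2) coincides with $T_{\varphi,n}$ for $\varphi(x) = \tfrac{1}{2}(x - 1)^2$. Also, the power-divergence statistic

$$\frac{2}{\lambda(\lambda + 1)} \sum_{j=1}^{k} X_j \left[ \left( \frac{X_j}{n p_{j0}} \right)^{\lambda} - 1 \right] \tag{1.6}$$

is asymptotically chi-square distributed with $k - 1$ degrees of freedom. This result was established by Cressie and Read [3].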
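The Cressie–Read power-divergence statistic can be sketched in a few lines. The version below assumes the standard form $\frac{2}{\lambda(\lambda+1)} \sum_j X_j [(X_j / (n p_{j0}))^\lambda - 1]$, with the cases $\lambda = 0$ and $\lambda = -1$ defined by continuity, and assumes all observed counts are positive. The function name and the data are illustrative only.

```python
import numpy as np

def power_divergence_statistic(x, p0, lam):
    """Cressie-Read power-divergence statistic for counts x under null probabilities p0.

    Assumes every x_j > 0 so that the logarithms and negative powers are finite.
    """
    x = np.asarray(x, dtype=float)
    expected = x.sum() * np.asarray(p0, dtype=float)
    if lam == 0:
        # Limit lam -> 0: log-likelihood ratio statistic
        return float(2.0 * np.sum(x * np.log(x / expected)))
    if lam == -1:
        # Limit lam -> -1: minimum discrimination information statistic
        return float(2.0 * np.sum(expected * np.log(expected / x)))
    return float(2.0 / (lam * (lam + 1.0)) * np.sum(x * ((x / expected) ** lam - 1.0)))

# Hypothetical data: n = 100 observations in k = 3 categories
x = np.array([18, 55, 27])
p0 = np.array([0.25, 0.50, 0.25])
expected = x.sum() * p0

pearson = float(np.sum((x - expected) ** 2 / expected))  # Pearson's X^2
neyman = float(np.sum((x - expected) ** 2 / x))          # Neyman modified X^2

for lam in (-2, -1, -0.5, 0, 2 / 3, 1):
    print(lam, power_divergence_statistic(x, p0, lam))
```

Here $\lambda = 1$ reproduces `pearson` and $\lambda = -2$ reproduces `neyman`. SciPy provides the same family as `scipy.stats.power_divergence` through its `lambda_` argument.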
Following Jhun and Jeong [4], in this paper we are interested in constructing confidence regions for $p$ based on the asymptotic distribution of (1.5) as well as on bootstrap methods. Those authors present a simulation study based on the Pearson statistic given in (1.2). Chronologically, Watson and Nguyen [9] and Watson [10] were the first to consider the problem of constructing confidence regions for $p = (p_1, p_2, p_3)$ in trinomial distributions. Their method was based on the asymptotic distribution of Pearson's statistic given in (1.2). Medak and Cressie [6] extended their results by using the power-divergence family of statistics given in (1.6).
In Section 2 confidence regions are introduced. In Section 3 two Monte Carlo simulation experiments are carried out to calculate average coverage probabilities and to make comparisons.
Simultaneous confidence regions
Let $\{S(x_1, \ldots, x_k)\}$ be a family of subsets of the parameter space, where $x_1, \ldots, x_k \geq 0$ are integers such that $x_1 + \cdots + x_k = n$. $\{S(x_1, \ldots, x_k)\}$ is said to be a family of confidence regions for $p$ at confidence level $1 - \alpha$, if
Confidence regions for the proportions of a multinomial population are one of the basic tools in statistical inference for categorical data. Divergence measures also play an important role in this area (see, e.g. Read and Cressie [7]
Monte Carlo investigation
In this section two Monte Carlo simulations are carried out to study the performance of (2.1) and (2.2) for $\varphi_\lambda(x)$ given in (1.4) and several selected values of $\lambda$. Some of these values correspond to well-known goodness-of-fit test statistics, such as Neyman's modified $X^2$ ($\lambda = -2$), minimum discrimination information ($\lambda = -1$), Freeman–Tukey ($\lambda = -1/2$), log-likelihood ratio ($\lambda = 0$), Cressie–Read ($\lambda = 2/3$) and Pearson's $X^2$ ($\lambda = 1$).
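A coverage experiment of this kind can be sketched as follows. Since the regions (2.1) and (2.2) are not reproduced here, the code assumes a generic construction: the asymptotic region collects all $p$ whose power-divergence statistic is below the chi-square quantile, and the bootstrap region replaces that quantile by the empirical quantile of the statistic recomputed on resamples drawn from the observed proportions. The function names, the trinomial example, the restriction to $\lambda > 0$ and the small replication counts are all assumptions made to keep the example short.

```python
import math
import numpy as np

def power_divergence(x, p0, lam):
    """Power-divergence statistic for lam > 0; x may be one count vector or a (B, k) array."""
    x = np.asarray(x, dtype=float)
    n = x.sum(axis=-1, keepdims=True)
    expected = n * np.asarray(p0, dtype=float)
    # Cells with zero expected count necessarily have zero observed count: contribute 0
    ratio = np.divide(x, expected, out=np.ones_like(x), where=expected > 0)
    return 2.0 / (lam * (lam + 1.0)) * np.sum(x * (ratio ** lam - 1.0), axis=-1)

def coverage(p, n, lam, alpha=0.05, reps=200, boot=200, seed=0):
    """Estimate coverage of the assumed asymptotic and bootstrap regions for trinomial p."""
    p = np.asarray(p, dtype=float)
    if p.size != 3:
        raise ValueError("the closed-form chi-square quantile below needs k = 3")
    rng = np.random.default_rng(seed)
    # Chi-square with 2 degrees of freedom: the (1 - alpha) quantile is -2 log(alpha)
    chi2_crit = -2.0 * math.log(alpha)
    hits_asym = 0
    hits_boot = 0
    for _ in range(reps):
        x = rng.multinomial(n, p)
        t_true = power_divergence(x, p, lam)   # statistic evaluated at the true p
        hits_asym += int(t_true <= chi2_crit)
        p_hat = x / n
        x_star = rng.multinomial(n, p_hat, size=boot)  # bootstrap resamples
        t_star = power_divergence(x_star, p_hat, lam)
        hits_boot += int(t_true <= np.quantile(t_star, 1.0 - alpha))
    return hits_asym / reps, hits_boot / reps

asym_cov, boot_cov = coverage([0.2, 0.3, 0.5], n=50, lam=2 / 3)
print(asym_cov, boot_cov)
```

With only 200 replications the estimates are noisy; a study of the size described in the text would use many more.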
In the first experiment 10 000 samples (replications) are drawn
References
[1] Ali and Silvey, A general class of coefficients of divergence of one distribution from another, Journal of the Royal Statistical Society, Series B (1966)
[2] Csiszár, Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten, Publications of the Mathematical Institute of the Hungarian Academy of Sciences, Series A (1963)
[3] Cressie and Read, Multinomial goodness-of-fit tests, Journal of the Royal Statistical Society, Series B (1984)
[4] Jhun and Jeong, Applications of bootstrap methods for categorical data analysis, Computational Statistics & Data Analysis (2000)
[5] Liese and Vajda, Convex Statistical Distances (1987)
☆ This work was supported by the grants BMF2003-00892 and BMF 2003-04820.