Resampling methods for ranked set samples
Introduction
Ranked set sampling (RSS), introduced by McIntyre (1952), has many applications in ecological and environmental studies (e.g., Dell and Clutter, 1972; Al-Saleh and Zheng, 2002), reliability theory (Kvam and Samaniego, 1994) and medical studies (Samawi and Al-Sagheer, 2001). RSS is useful when a measurement of interest is difficult or expensive to obtain, but the units can be easily ordered by some means without actual quantification. The RSS procedure is briefly described as follows.
RSS is a two-stage sampling procedure. In the first stage, units are identified and ranked; in the second stage, measurements are taken from a fraction of the ranked units. Let $mk^2$ units be randomly identified from the population. The units are then randomly divided into k groups of mk units each. In the first group, the units are further randomly divided into m subgroups of size k. Units in each of the m subgroups are ordered by any means other than actual quantification, and an actual measurement is taken from the unit having the lowest rank within each subgroup. The resulting measurements from the first group are labeled $X_{(1)1}, \ldots, X_{(1)m}$. This step is repeated for the second group, but this time the actual measurement is taken from the unit having the second lowest rank within each subgroup. The procedure continues until all k groups are processed, so that in the rth group the actual measurement is taken from the unit having the rth lowest rank within each subgroup, yielding measurements $X_{(r)1}, \ldots, X_{(r)m}$. The resulting ranked set sample is denoted by $\{X_{(r)j}: r = 1, \ldots, k;\ j = 1, \ldots, m\}$. Note that the resulting measurements are independently distributed. However, they are not identically distributed.
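The two-stage procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: ranking is taken to be perfect (the code sorts on the true values, standing in for ranking "by any means other than actual quantification"), and the function name `draw_rss`, the `sampler` argument, and the lognormal example population are hypothetical choices, not part of the original article.

```python
import random


def draw_rss(sampler, k, m, rng):
    """Draw a ranked set sample as a list of k rows with m measurements each.

    Assumption: perfect ranking. Sorting by the true value stands in for
    judgment ranking, which in practice would not use the measurement itself.
    Row r (0-based) holds the (r+1)th lowest unit from each of m subgroups
    of size k, so m * k * k units are identified in total.
    """
    sample = []
    for r in range(k):                      # one group per rank
        row = []
        for _ in range(m):                  # m subgroups within the group
            subgroup = sorted(sampler(rng) for _ in range(k))
            row.append(subgroup[r])         # measure only the unit of rank r+1
        sample.append(row)
    return sample


# Hypothetical example: a lognormal population, set size k = 3, m = 4 cycles.
rng = random.Random(0)
rss = draw_rss(lambda g: g.lognormvariate(0.0, 1.0), k=3, m=4, rng=rng)
```

Each row of `rss` is a sample from a different order-statistic distribution, which is why the measurements are independent but not identically distributed.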
Existing results show that one can often improve the accuracy of the analysis using RSS. The available results for RSS focus on inference about population characteristics either under specific parametric assumptions or through asymptotic results, with few results available for finite samples when the underlying distribution F of the observed data is unknown. For applications with small m, however, asymptotic inference may not be valid, and for complex statistics the sampling distribution, and hence the standard error, may not be known. The bootstrap offers an alternative: it estimates the sampling distribution of a statistic by replacing F with an estimate $\hat{F}$ and evaluating the sampling distribution under $\hat{F}$.
The bootstrap is a viable approach for obtaining the sampling distribution of test statistics under simple random sampling (SRS), and it is important to study its use with RSS. Several methods of bootstrapping an RSS suggest themselves. What are the algorithms for these methods? Which algorithm has better coverage probability and produces more accurate confidence intervals? We study the use of the bootstrap to draw inference under RSS. Chen et al. (2004) introduced the method of bootstrapping an RSS row-wise. Hui et al. (2005) considered bootstrapping as a way to construct confidence intervals for the population mean estimated via linear regression under RSS. To motivate the use of resampling in RSS and to illustrate these resampling methods, we focus our investigation on the trimmed mean, a member of the class of L-estimators, in RSS. Simulation results are given for the 20% sample trimmed mean for symmetric distributions and the 10% one-sided sample trimmed mean for a skewed distribution.
The article is arranged as follows. Section 2 describes and discusses the properties of three bootstrap methods: BRSSR (bootstrap RSS by row), BRSS (bootstrap RSS), and MRBRSS (mixed row bootstrap RSS). Section 3 describes linear estimators under RSS and presents the results of a simulation study comparing the three methods. An application using these resampling methods is discussed in Section 4. A summary and concluding remarks are given in Section 5.
Resampling methods for ranked set samples
Given a sample $X_1, \ldots, X_n$ drawn from an unknown distribution F, an SRS can be obtained by randomly sampling n units from $X_1, \ldots, X_n$ with replacement and equal probability. A ranked set sample, on the other hand, has a complex structure, one that contains information about measurements as well as their partial ordering. An RSS is composed of k independent random samples that are drawn from different distributions. In this respect, we may regard RSS as a stratified sampling design, for which the standard
Linear estimators
Let $X_{(1)} \le \cdots \le X_{(n)}$ be the order statistics of n independently and identically distributed variates from a continuous distribution F. An L-estimator is defined as $L_n = \sum_{i=1}^{n} c_i X_{(i)}$, where $c_1, \ldots, c_n$ are known constants. Depending on how these weighting constants are defined, both the usage of $L_n$ and its asymptotic properties differ. Asymptotic properties of $L_n$ can be found in Serfling (1980). Unfortunately, it is difficult to obtain analytical expressions for the standard error of L-estimators.
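To make the definition concrete, the following sketch computes an L-estimator as a weighted sum of order statistics and builds the weights that turn it into a trimmed mean. The function names and the convention of specifying the trimming by counts of observations dropped at each end are assumptions made for illustration; the article's exact trimming convention is not shown in this excerpt.

```python
def l_estimator(values, weights):
    """Weighted sum of the order statistics: sum of c_i * x_(i)."""
    ordered = sorted(values)
    if len(ordered) != len(weights):
        raise ValueError("need exactly one weight per observation")
    return sum(c * x for c, x in zip(weights, ordered))


def trimmed_mean_weights(n, drop_low, drop_high):
    """Weights giving the mean of what remains after dropping the
    drop_low smallest and drop_high largest of n observations."""
    kept = n - drop_low - drop_high
    if kept <= 0:
        raise ValueError("trimming removes every observation")
    return [0.0] * drop_low + [1.0 / kept] * kept + [0.0] * drop_high


data = [9, 1, 5, 3, 100, 7, 2, 8, 4, 6]

# Symmetric trimming: drop one observation from each end (10% per tail here).
two_sided = l_estimator(data, trimmed_mean_weights(len(data), 1, 1))

# One-sided trimming: drop only the largest observation.
one_sided = l_estimator(data, trimmed_mean_weights(len(data), 0, 1))
```

Here `two_sided` averages the values 2 through 9 and `one_sided` averages 1 through 9, so the outlying value 100 affects neither estimate.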
Application
Mode et al. (1999) described an RSS application that measures habitat sizes, which are known to be linked to salmon production. Many of these habitats are near streams and forests in the Pacific Northwest, and measuring habitat areas is labor intensive and time consuming. Unfortunately, the actual dataset is unavailable. However, Mode et al.'s paper notes that extreme-valued habitat sizes occur and that the LogNormal distribution may provide a good fit to the distribution of habitat sizes. We use a
Concluding remarks
RSS is concerned with samples collected from the field under cost, time or other logistical restrictions. Once the samples are collected, however, one must be able to draw inference from them. In this article, we assume an RSS is available and address the issue of obtaining standard errors for RSS estimates based on the bootstrap. We discuss three methods of bootstrapping an RSS. The row-wise bootstrap, BRSSR, obtains bootstrap resamples by sampling m observations from each of k rows. True to its stratified nature,
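As a rough illustration of the row-wise scheme described above, the sketch below resamples m observations with replacement within each of the k rows and uses the spread of the bootstrap replicates as a standard error estimate. It covers only the row-wise idea; the function names, the use of the overall mean as the statistic, and the toy data are assumptions for illustration and do not reproduce the article's algorithms for BRSS or MRBRSS.

```python
import random
import statistics


def brssr_resample(rss, rng):
    """Resample each row with replacement, keeping its length m.

    Rows are treated as separate strata, so values never move between rows.
    """
    return [[rng.choice(row) for _ in row] for row in rss]


def rss_mean(rss):
    """Overall mean of all measurements in the ranked set sample."""
    return statistics.fmean([x for row in rss for x in row])


def brssr_standard_error(rss, statistic, n_boot, rng):
    """Standard deviation of the statistic over n_boot row-wise resamples."""
    replicates = [statistic(brssr_resample(rss, rng)) for _ in range(n_boot)]
    return statistics.stdev(replicates)


# Hypothetical toy data: k = 3 rows (ranks) with m = 4 measurements each.
toy_rss = [
    [0.4, 0.7, 0.5, 0.9],
    [1.1, 1.6, 1.3, 1.2],
    [2.8, 3.5, 2.2, 4.1],
]
se = brssr_standard_error(toy_rss, rss_mean, n_boot=500, rng=random.Random(42))
```

Any statistic that accepts the k-by-m layout, such as a trimmed mean of the pooled values, can be passed in place of `rss_mean`.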
Acknowledgements
We thank the editor and two anonymous referees whose helpful comments and suggestions improved the paper.
References
On ranked-set sample quantiles and their applications. J. Statist. Plann. Infer. (2000)
Estimation of bivariate characteristics using ranked set sampling. Austral. New Zealand J. Statist. (2002)
Some asymptotic theory for the bootstrap. Annals Statist. (1981)
Ranked Set Sampling: Theory and Applications (2004)
Ranked set sampling theory with order statistics background. Biometrics (1972)
The Bootstrap and Edgeworth Expansion (1992)
Bootstrap confidence interval estimation of mean via ranked set sampling linear regression. J. Statist. Comput. Simulation (2005)
The exact bootstrap mean and variance of an L-estimator. J. Roy. Statist. Soc. Ser. B (2000)
Research was supported in part by a grant from the USEPA and was completed while visiting the Center for Statistical Ecology and Environmental Statistics, Department of Statistics, The Pennsylvania State University.