Efficient tests for one sample correlated binary data with applications

Statistical Methods & Applications

Abstract

Four testing procedures are considered for testing the response rate of one sample correlated binary data with a cluster size of one or two, a setting that often occurs in otolaryngologic and ophthalmologic studies. Although an asymptotic approach is often used for statistical inference, it is criticized for unsatisfactory type I error control in small sample settings. An alternative to the asymptotic approach is an unconditional approach. The first unconditional approach is the one based on estimation, also known as the parametric bootstrap (Lee and Young in Stat Probab Lett 71(2):143–153, 2005). The other two unconditional approaches considered in this article are an approach based on maximization (Basu in J Am Stat Assoc 72(358):355–366, 1977) and an approach based on estimation and maximization (Lloyd in Biometrics 64(3):716–723, 2008a). These two unconditional approaches guarantee the test size and are generally more reliable than the asymptotic approach. We compare these four approaches in conjunction with a test proposed by Lee and Dubin (Stat Med 13(12):1241–1252, 1994) and a likelihood ratio test derived in this article, with regard to type I error rate and power for small to medium sample sizes. An example from an otolaryngologic study is provided to illustrate the various testing procedures. The unconditional approach based on estimation and maximization using the test in Lee and Dubin (Stat Med 13(12):1241–1252, 1994) is preferable because of its power advantage.

References

  • Agresti A (2002) Categorical data analysis. Wiley series in probability and statistics, 2nd edn. Wiley, Hoboken

  • Barnard GA (1945) A new test for 2 \(\times\) 2 tables. Nature 156:177

  • Basu D (1977) On the elimination of nuisance parameters. J Am Stat Assoc 72(358):355–366

  • Cochran WG (1977) Sampling techniques, 3rd edn. Wiley, New York

  • Corcoran C, Ryan L, Senchaudhuri P, Mehta C, Patel N, Molenberghs G (2001) An exact trend test for correlated binary data. Biometrics 57(3):941–948

  • Donner A, Klar N (1993) Confidence interval construction for effect measures arising from cluster randomization trials. J Clin Epidemiol 46(2):123–131

  • Evans RJ, Forcina A (2013) Two algorithms for fitting constrained marginal models. Comput Stat Data Anal 66:1–7

  • Fisher RA (1970) Statistical methods for research workers, 14th edn. Hafner Press, New York

  • Jung SH, Ahn C (2000) Estimation of response probability in correlated binary data: a new approach. Drug Inf J 34:599–604

  • Kang SH, Park SM (2000) Exact likelihood ratio test of independence of binary responses within clusters. Comput Stat Data Anal 33:15–23

  • Kang S-H, Chung S-J, Ahn CW (2005) Exact tests for one sample correlated binary data. Biom J 47(2):188–193

  • Lee EW, Dubin N (1994) Estimation and sample size considerations for clustered binary responses. Stat Med 13(12):1241–1252

  • Lee S, Young G (2005) Parametric bootstrapping with nuisance parameters. Stat Probab Lett 71(2):143–153

  • Lloyd CJ (2008a) A new exact and more powerful unconditional test of no treatment effect from binary matched pairs. Biometrics 64(3):716–723

  • Lloyd CJ (2008b) Exact p-values for discrete models obtained by estimation and maximization. Aust N Z J Stat 50(4):329–345

  • Lloyd CJ, Moldovan MV (2008) A more powerful exact test of noninferiority from binary matched-pairs data. Stat Med 27(18):3540–3549

  • Mak TK (1988) Analysing intraclass correlation for dichotomous variables. J R Stat Soc Ser C (Appl Stat) 37(3):344–352

  • Mandel EM, Bluestone CD, Rockette HE, Blatter MM, Reisinger KS, Wucher FP, Harper J (1982) Duration of effusion after antibiotic treatment for acute otitis media: comparison of cefaclor and amoxicillin. Pediatr Infect Dis 1(5):310–316

  • Qu Y, Piedmonte M, Williams G (1994) Small sample validity of latent variable models for correlated binary data. Commun Stat Simul Comput 23(1):243–269

  • Rosner B (1982) Statistical methods in ophthalmology: an adjustment for the intraclass correlation between eyes. Biometrics 38(1):105–114

  • Shan G (2013a) A note on exact conditional and unconditional tests for Hardy-Weinberg equilibrium. Hum Hered 76(1):10–17

  • Shan G (2013b) Exact unconditional testing procedures for comparing two independent Poisson rates. J Stat Comput Simul. doi:10.1080/00949655.2013.855776

  • Shan G (2013c) More efficient unconditional tests for exchangeable binary data with equal cluster sizes. Stat Probab Lett 83(2):644–649

  • Shan G, Ma C (2012) Unconditional tests for comparing two ordered multinomials. Stat Methods Med Res. doi:10.1177/0962280212450957

  • Shan G, Ma C (2013) Exact methods for testing the equality of proportions for binary clustered data from otolaryngologic studies. Stat Biopharm Res. doi:10.1080/19466315.2013.861767

  • Shan G, Ma C, Hutson AD, Wilding GE (2012) An efficient and exact approach for detecting trends with binary endpoints. Stat Med 31(2):155–164

  • Shan G, Ma C, Hutson AD, Wilding GE (2013) Some tests for detecting trends based on the modified Baumgartner–Weiß–Schindler statistics. Comput Stat Data Anal 57(1):246–261

  • Tang N-S, Tang M-L, Qiu S-F (2008) Testing the equality of proportions for correlated otolaryngologic data. Comput Stat Data Anal 52(7):3719–3729

Acknowledgments

The authors would like to thank the Editor and two referees for their valuable comments and suggestions that improved this article significantly.

Author information

Corresponding author

Correspondence to Guogen Shan.

Appendix

Likelihood ratio test statistic \(\mathbf{T}_{\mathbf{LR}}\). The log-likelihood is expressed as

$$\begin{aligned} l(\pi , \rho \mid \mathbf{N_1},\mathbf{N_2})&= \log \left( \frac{(n-n_1)!}{N_{22}!\, N_{21}!\, N_{20}!}\right) +N_{11}\log (\pi )+N_{10}\log (1-\pi )\\&\quad +N_{22}\log [\pi ^2+\rho \pi (1-\pi )]+N_{21}\log [2(1-\rho )(1-\pi )]\\&\quad +N_{20}\log [(1-\pi )^2+\rho \pi (1-\pi )]. \end{aligned}$$

Differentiating \(l(\pi , \rho )\) with respect to \((\pi , \rho )\) yields the score function

$$\begin{aligned} \frac{\partial l}{\partial \pi }&= \frac{N_{11}+2N_{22}}{\pi } - \frac{N_{10}+N_{21}+2N_{20}}{1-\pi } - \frac{N_{22} \rho }{\pi (\pi +\rho -\pi \rho )}\\&+\frac{N_{20}\rho }{(1-\pi )(1-\pi +\pi \rho )},\\ \frac{\partial l}{\partial \rho }&= -\frac{N_{21}}{1-\rho }+\frac{N_{22}(1-\pi )}{\pi +\rho -\pi \rho } + \frac{N_{20}\pi }{1-\pi +\pi \rho }. \end{aligned}$$

The unrestricted MLE of \((\pi , \rho )\), denoted by \((\hat{\pi }, \hat{\rho })\), is the solution to the following equations, which can be solved numerically by the Fisher scoring method,

$$\begin{aligned} \frac{\partial l}{\partial \pi } = 0 \quad \hbox { and }\quad \frac{\partial l}{\partial \rho } = 0. \end{aligned}$$

After lengthy algebra, \(\hat{\rho }\) can be obtained as a root of the third-order polynomial equation

$$\begin{aligned} a \rho ^3 + b\rho ^2 + c\rho +d = 0, \end{aligned}$$

where

$$\begin{aligned} a&= N_{10}(N_{11}-N_{21})n_2,\\ b&= N_{10}^2N_{21}\!+\!N_{10}^2N_{22} +N_{10}N_{11}N_{21} +N_{10}N_{20}N_{21}+2N_{10}N_{20}N_{22} +3N_{10}N_{21}N_{22}\\&+2N_{10}N_{22}^2+N_{11}^2N_{20}+N_{11}^2N_{21}+2N_{11}N_{20}^2 +N_{11}N_{20}N_{21}+2N_{11}N_{20}N_{22}\\&-N_{11}N_{21}^2+N_{11}N_{21}N_{22}-2N_{20}^2N_{21}-2N_{20}N_{21}^2 -2N_{20}N_{21}N_{22}-N_{21}^2N_{22},\\ c&= N_{10}^2N_{21}-N_{10}N_{11}N_{20}+N_{10}N_{11}N_{21} -N_{10}N_{11}N_{22}+4N_{10}N_{20}N_{21}\\&+2N_{10}N_{20}N_{22}+N_{10}N_{21}^2+2N_{10}N_{21}N_{22} -2N_{10}N_{22}^2+N_{11}^2N_{21}-2N_{11}N_{20}^2\\&+N_{11}N_{20}N_{21}+2N_{11}N_{20}N_{22}+3N_{11}N_{21}N_{22} +4N_{20}^2N_{21}+4N_{20}^2N_{22}\\&+2N_{20}N_{21}^2+6N_{20}N_{21}N_{22}+4N_{20}N_{22}^2 +2N_{21}N_{22}^2,\\ d&= -N_{10}^2N_{22}+N_{10}N_{11}N_{21} -4N_{10}N_{20}N_{22}-N_{11}^2N_{20}+2N_{11}N_{20}N_{21}\\&-4N_{11}N_{20}N_{22}+N_{11}N_{21}^2-4N_{20}^2N_{22} -4N_{20}N_{22}^2+N_{21}^2N_{22}, \end{aligned}$$

and

$$\begin{aligned} \hat{\pi }=\frac{(N_{21}+2N_{22}-n_2\hat{\rho })\pm \sqrt{(\hat{\rho }n_2)^2+2\hat{\rho }n_2N_{21}-4N_{20}N_{22}+N_{21}^2}}{2n_2(1-\hat{\rho })}. \end{aligned}$$
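The closed form for \(\hat{\pi }\) can be checked directly against the score equation for \(\rho \). What follows is a sketch of that check, not necessarily the authors' own derivation. The shorthand \(r = 1-\rho \) is introduced here only for convenience, and \(n_2 = N_{20}+N_{21}+N_{22}\) is assumed to be the number of clusters of size two. Multiplying \(\partial l/\partial \rho = 0\) by \((1-\rho )(\pi +\rho -\pi \rho )(1-\pi +\pi \rho )\), and noting that \(\pi +\rho -\pi \rho = \rho +r\pi \) and \(1-\pi +\pi \rho = 1-r\pi \), gives a quadratic in \(\pi \):

$$\begin{aligned} 0&= N_{22}\, r(1-\pi )(1-r\pi ) + N_{20}\, r\pi (\rho +r\pi ) - N_{21}(\rho +r\pi )(1-r\pi )\\&= n_2 r^2\pi ^2 - r(N_{21}+2N_{22}-n_2\rho )\pi + (N_{22}r-N_{21}\rho ). \end{aligned}$$

After dividing out the common factor \(r^2\), its discriminant simplifies to

$$\begin{aligned} (N_{21}+2N_{22}-n_2\rho )^2 - 4n_2(N_{22}r-N_{21}\rho ) = (\rho n_2)^2+2\rho n_2N_{21}-4N_{20}N_{22}+N_{21}^2, \end{aligned}$$

so the quadratic formula, evaluated at \(\rho =\hat{\rho }\), reproduces the two candidate values of \(\hat{\pi }\) given above.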

We then evaluate the log-likelihood at each candidate solution that lies in the parameter space, and the candidate with the largest log-likelihood is taken as the MLE. Another method that may be used to derive the LR test is described by Evans and Forcina (2013).

Under the null hypothesis \(H_0: \pi =\pi _0\), the MLE of \(\rho \) is given by

$$\begin{aligned} \hat{\rho }_{H_0}=-\frac{N_{21}+N_{22}-\pi _0(N_{20}+2N_{21}+3N_{22})+2n_2\pi _0^2\pm \sqrt{f}}{2\pi _0(1-\pi _0)n_2}, \end{aligned}$$

where \(f= (4N_{21}n_2+(N_{20}-N_{22})^2)\pi _0^2 + 2(N_{20}(N_{21}+2N_{22})-n_2(2N_{21}+N_{22}))\pi _0+(N_{21}+N_{22})^2\), and only solutions in the parameter space are kept. When both values of \(\hat{\rho }_{H_0}\) are in the parameter space, the one with the larger null log-likelihood is taken as the solution.
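The selection rule for the null MLE of \(\rho \) can be illustrated with a short Python sketch. It is not the authors' implementation. The function names are hypothetical, and the parameter space is assumed here to be the set of \(\rho \) for which all three cell probabilities of a size-two cluster are strictly positive, that is \(-\min \{\pi _0/(1-\pi _0),(1-\pi _0)/\pi _0\}<\rho <1\). Only the part of the null log-likelihood that depends on \(\rho \) is evaluated, since the remaining terms are constant under \(H_0\).

```python
import math


def null_loglik_kernel(rho, n20, n21, n22, pi0):
    """Part of the null log-likelihood that depends on rho (size-two clusters)."""
    q0 = 1.0 - pi0
    return (n22 * math.log(pi0 ** 2 + rho * pi0 * q0)
            + n21 * math.log(1.0 - rho)
            + n20 * math.log(q0 ** 2 + rho * pi0 * q0))


def rho_null_mle(n20, n21, n22, pi0):
    """Closed-form candidates for the MLE of rho under H0: pi = pi0.

    Returns the admissible root with the larger null log-likelihood, or None
    when no root lies strictly inside the assumed parameter space.
    """
    n2 = n20 + n21 + n22
    q0 = 1.0 - pi0
    # Numerator terms of the closed form, with f as defined in the text.
    b = n21 + n22 - pi0 * (n20 + 2 * n21 + 3 * n22) + 2 * n2 * pi0 ** 2
    f = ((4 * n21 * n2 + (n20 - n22) ** 2) * pi0 ** 2
         + 2 * (n20 * (n21 + 2 * n22) - n2 * (2 * n21 + n22)) * pi0
         + (n21 + n22) ** 2)
    if f < 0:
        return None  # no real root
    denom = 2 * pi0 * q0 * n2
    candidates = [-(b + sign * math.sqrt(f)) / denom for sign in (1.0, -1.0)]
    # Assumption: admissible rho keeps every cell probability strictly positive.
    lower = -min(pi0 / q0, q0 / pi0)
    admissible = [r for r in candidates if lower < r < 1.0]
    if not admissible:
        return None
    return max(admissible,
               key=lambda r: null_loglik_kernel(r, n20, n21, n22, pi0))


rho_hat = rho_null_mle(n20=1, n21=1, n22=1, pi0=0.5)
print(rho_hat)
```

For the illustrative counts \(N_{20}=N_{21}=N_{22}=1\) with \(\pi _0=0.5\), the two roots are \(-1\) and \(1/3\); only \(1/3\) lies strictly inside the assumed parameter space, so it is returned. When the maximum sits on the boundary (for example \(N_{21}=0\)), the sketch returns `None` and a boundary search would be needed instead.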

Cite this article

Shan, G., Ma, C. Efficient tests for one sample correlated binary data with applications. Stat Methods Appl 23, 175–188 (2014). https://doi.org/10.1007/s10260-013-0251-6

  • DOI: https://doi.org/10.1007/s10260-013-0251-6