
Evaluation of confidence intervals for the kappa statistic when the assumption of marginal homogeneity is violated

  • Original Paper
  • Published in: Computational Statistics

Abstract

This article studies the robustness of confidence interval construction for the intraclass kappa statistic based on a dichotomous response when the assumption of marginal homogeneity across two raters is violated. Two methods of construction are considered: the goodness-of-fit approach and the modified Wald method. Both methods were evaluated by exact calculation of the confidence interval coverage they produce. Under mild departures from marginal homogeneity (differences in rater success rates of less than 10%), the goodness-of-fit approach can be recommended. Moreover, under these same conditions, Cohen’s kappa tends to be less biased as a point estimator than the intraclass kappa statistic.
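The two point estimators compared in the abstract can be illustrated with a minimal Python sketch. It is not taken from the article. It assumes the standard textbook definitions for two raters and a dichotomous response: Cohen's kappa uses each rater's own marginal success rate to compute chance agreement, while the intraclass kappa pools the two raters into a single success rate (the form that coincides with Scott's pi for a 2 x 2 table). The function names and the cell labels `n11`, `n10`, `n01`, `n00` are hypothetical, chosen only for this illustration.

```python
def cohen_kappa(n11, n10, n01, n00):
    """Cohen's kappa for a 2x2 table of counts (assumed textbook definition).

    n11: both raters positive, n10: rater 1 positive only,
    n01: rater 2 positive only, n00: both raters negative.
    """
    n = n11 + n10 + n01 + n00
    p1 = (n11 + n10) / n          # success rate of rater 1
    p2 = (n11 + n01) / n          # success rate of rater 2
    observed = (n11 + n00) / n    # observed agreement
    expected = p1 * p2 + (1 - p1) * (1 - p2)  # chance agreement, separate marginals
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def intraclass_kappa(n11, n10, n01, n00):
    """Intraclass kappa for a 2x2 table, using a pooled success rate.

    This assumes marginal homogeneity: both raters share one success rate.
    """
    n = n11 + n10 + n01 + n00
    pooled = (2 * n11 + n10 + n01) / (2 * n)     # pooled success rate
    chance_disagreement = 2 * pooled * (1 - pooled)
    if chance_disagreement == 0:
        raise ValueError("kappa is undefined when the pooled rate is 0 or 1")
    observed_disagreement = (n10 + n01) / n
    return 1 - observed_disagreement / chance_disagreement


if __name__ == "__main__":
    # Illustrative counts only: rater success rates are 0.50 and 0.45,
    # i.e. a mild departure from marginal homogeneity.
    table = (40, 10, 5, 45)
    print("Cohen's kappa:   ", cohen_kappa(*table))
    print("Intraclass kappa:", intraclass_kappa(*table))
```

When the two discordant cells are equal, the raters' marginal rates coincide and the two functions return the same value. The estimators diverge only as the marginals drift apart, which is the situation the article investigates.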



Corresponding author

Correspondence to Sameer Parpia.


Cite this article

Parpia, S., Koval, J.J. & Donner, A. Evaluation of confidence intervals for the kappa statistic when the assumption of marginal homogeneity is violated. Comput Stat 28, 2709–2718 (2013). https://doi.org/10.1007/s00180-013-0424-7
