
Measuring the Reliability of Diagnostic Classification Model Examinee Estimates


Abstract

Over the past decade, diagnostic classification models (DCMs) have become an active area of psychometric research. Despite their use, the reliability of examinee estimates in DCM applications has seldom been reported. In this paper, a reliability measure for the categorical latent variables of DCMs is defined. Using theory- and simulation-based results, we show how DCMs uniformly provide greater examinee estimate reliability than IRT models for tests of the same length, a result that is a consequence of the smaller range of latent variable values examinee estimates can take in DCMs. We demonstrate this result by comparing DCM and IRT reliability for a series of models estimated with data from an end-of-grade test, culminating in a discussion of how DCMs can be used to change the character of large-scale testing, either by shortening tests that measure examinees unidimensionally or by providing more reliable multidimensional measurement for tests of the same length.
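
The specific reliability index defined in the paper is not reproduced here. As a rough illustration of what reliability can mean for a categorical latent variable, the minimal Python sketch below simulates a one-attribute DCM with slip and guess parameters, classifies each simulated examinee on two independent parallel forms from the posterior probability of mastery, and reports the agreement between the two classifications. All names and values (guess, slip, p_master, the 0.5 classification cut) are assumptions chosen for illustration only, not the model, data, or index used by the authors.

import numpy as np

rng = np.random.default_rng(1)

N, J = 5000, 20                      # examinees, items (hypothetical values)
guess = np.full(J, 0.20)             # P(correct | non-master), assumed known
slip = np.full(J, 0.15)              # P(incorrect | master), assumed known
p_master = 0.5                       # assumed base rate of attribute mastery

alpha = rng.random(N) < p_master     # true mastery status

def administer(alpha):
    """Simulate item responses for one test form given true mastery."""
    p_correct = np.where(alpha[:, None], 1 - slip, guess)
    return (rng.random((alpha.size, J)) < p_correct).astype(int)

def classify(x):
    """Posterior P(master | responses) under known parameters, 0.5 cut."""
    ll_m = (x * np.log(1 - slip) + (1 - x) * np.log(slip)).sum(axis=1)
    ll_n = (x * np.log(guess) + (1 - x) * np.log(1 - guess)).sum(axis=1)
    post = 1.0 / (1.0 + np.exp(ll_n - ll_m) * (1 - p_master) / p_master)
    return post > 0.5

c1 = classify(administer(alpha))     # classification from form 1
c2 = classify(administer(alpha))     # classification from a parallel form

print("agreement rate:", (c1 == c2).mean())
print("phi correlation:", np.corrcoef(c1.astype(float), c2.astype(float))[0, 1])

With only two latent classes per attribute, short tests already yield high agreement between parallel classifications, which is the intuition behind the paper's claim that DCM examinee estimates can be more reliable than continuous IRT estimates for tests of the same length.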



Author information


Corresponding author

Correspondence to Jonathan Templin.

Additional information

We would like to thank Terry Ackerman, Allan Cohen, Jeff Douglas, Robert Henson, John Poggio, and John Willse for their helpful comments and critiques of the concepts and text presented in this paper. Complete syntax for running all analyses herein and resulting program output are available at the first author’s website.

This research was funded by National Science Foundation grants DRL-0822064, SES-0750859, and SES-1030337. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.


About this article

Cite this article

Templin, J., Bradshaw, L. Measuring the Reliability of Diagnostic Classification Model Examinee Estimates. J Classif 30, 251–275 (2013). https://doi.org/10.1007/s00357-013-9129-4
