The Lack of Cross-Validation Can Lead to Inflated Results and Spurious Conclusions: A Re-Analysis of the MacArthur Violence Risk Assessment Study

Journal of Classification

Abstract

Cross-validation is an important evaluation strategy in behavioral predictive modeling; without it, estimates of a model's predictive accuracy are likely to be overly optimistic. Statistical methods have been developed that allow researchers to straightforwardly cross-validate predictive models by using the same data employed to construct the model. In the present study, cross-validation techniques were used to construct several decision-tree models with data from the MacArthur Violence Risk Assessment Study (Monahan et al., 2001). The models were then compared with the original (non-cross-validated) Classification of Violence Risk assessment tool. The results show that the measures of predictive model accuracy (AUC, misclassification error, sensitivity, specificity, positive and negative predictive values) degrade considerably when applied to a testing sample, compared with the training sample used to fit the model initially. In addition, unless false negatives (that is, incorrectly predicting individuals to be nonviolent) are considered more costly than false positives (that is, incorrectly predicting individuals to be violent), the models generally make few predictions of violence. The results suggest that employing cross-validation when constructing models can make an important contribution to increasing the reliability and replicability of psychological research.
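The gap between training-sample and testing-sample accuracy described in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis: the data are synthetic noise rather than the MacArthur data, the "model" is a one-split decision stump rather than a full classification tree, and every name (fit_stump, k_fold_indices, cross_validated_accuracy) is hypothetical. Because the labels are unrelated to the predictors, any accuracy above chance on the fitting sample comes purely from selecting the best-looking predictor.

```python
import random

def fit_stump(X, y):
    """Pick the single binary feature (or its negation) that best matches y."""
    best = (0, False, -1.0)  # (feature index, flip prediction?, training accuracy)
    n = len(y)
    for j in range(len(X[0])):
        agree = sum(1 for row, label in zip(X, y) if row[j] == label)
        # Predicting the negated feature gets exactly the complementary hits.
        for flip, hits in ((False, agree), (True, n - agree)):
            acc = hits / n
            if acc > best[2]:
                best = (j, flip, acc)
    return best

def predict(stump, row):
    j, flip, _ = stump
    return 1 - row[j] if flip else row[j]

def accuracy(stump, X, y):
    return sum(predict(stump, r) == t for r, t in zip(X, y)) / len(y)

def k_fold_indices(n, k, rng):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validated_accuracy(X, y, k, rng):
    """Refit the stump on each training split; score only held-out cases."""
    hits = 0
    for fold in k_fold_indices(len(y), k, rng):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        stump = fit_stump([X[i] for i in train], [y[i] for i in train])
        hits += sum(predict(stump, X[i]) == y[i] for i in fold)
    return hits / len(y)

rng = random.Random(0)
n, p = 100, 200  # assumed sizes: 100 cases, 200 noise predictors
X = [[rng.randint(0, 1) for _ in range(p)] for _ in range(n)]
y = [rng.randint(0, 1) for _ in range(n)]  # labels are pure noise

apparent = accuracy(fit_stump(X, y), X, y)        # scored on the fitting data
cv = cross_validated_accuracy(X, y, k=10, rng=rng)  # scored on held-out data
print(f"apparent accuracy:   {apparent:.2f}")
print(f"10-fold CV accuracy: {cv:.2f}")
```

The key design point is that predictor selection happens inside each fold. Selecting the predictor once on the full sample and only then cross-validating would leak information from the held-out cases and reintroduce the optimism.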

References

  • BANKS, S., ROBBINS, P.C., SILVER, E., VESSELINOV, R., STEADMAN, H.J., MONAHAN, J., and ROTH, L.H. (2004), “A Multiple-Models Approach to Violence Risk Assessment Among People With Mental Disorder”, Criminal Justice and Behavior, 31, 324–340.

  • BERK, R. (2011), “Asymmetric Loss Functions for Forecasting in Criminal Justice Settings”, Journal of Quantitative Criminology, 27, 107–123.

  • BERK, R. (2012), Criminal Justice Forecasts of Risk: A Machine Learning Approach, New York, NY: Springer.

  • BREIMAN, L. (1996), “Bagging Predictors”, Machine Learning, 26, 123–140.

  • BREIMAN, L. (2001), “Random Forests”, Machine Learning, 45, 5–32.

  • BREIMAN, L., FRIEDMAN, J.H., OLSHEN, R.A., and STONE, C.J. (1984), Classification and Regression Trees, Belmont, CA: Wadsworth and Brooks.

  • BREIMAN, L., and SPECTOR, P. (1992), “Submodel Selection and Evaluation in Regression. The X-Random Case”, International Statistical Review, 291–319.

  • DOYLE, M., SHAW, J., CARTER, S., and DOLAN, M. (2010), “Investigating the Validity of the Classification of Violence Risk in a UK Sample”, International Journal of Forensic Mental Health, 9, 316–323.

  • FERNÁNDEZ-DELGADO, M., CERNADAS, E., BARRO, S., and AMORIM, D. (2014), “Do We Need Hundreds of Classifiers to Solve Real World Classification Problems?”, The Journal of Machine Learning Research, 15, 3133–3181.

  • GARDNER, W., LIDZ, C.W., MULVEY, E.P., and SHAW, E.C. (1996), “A Comparison of Actuarial Methods for Identifying Repetitively Violent Patients with Mental Illnesses”, Law and Human Behavior, 20, 35–48.

  • GINI, C. (1912), Variability and Mutability: Contribution to the Study of Distributions and Report Statistics, Bologna, Italy: C. Cuppini.

  • HARE, R.D. (1980), “A Research Scale for the Assessment of Psychopathy in Criminal Populations”, Personality and Individual Differences, 1, 111–119.

  • HARRIS, G.T., and RICE, M.E. (2013), “Bayes and Base Rates: What is an Informative Prior for Actuarial Violence Risk Assessment?”, Behavioral Sciences and the Law, 31, 103–124.

  • HASTIE, T., TIBSHIRANI, R., and FRIEDMAN, J. (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd ed.), New York, NY: Springer.

  • JAMES, G., WITTEN, D., HASTIE, T., and TIBSHIRANI, R. (2013), An Introduction to Statistical Learning, New York, NY: Springer.

  • KUHN, M., and JOHNSON, K. (2013), Applied Predictive Modeling, New York, NY: Springer.

  • MCCUSKER, P.J. (2007), “Issues Regarding the Clinical Use of the Classification of Violence Risk (COVR) Assessment Instrument”, International Journal of Offender Therapy and Comparative Criminology, 51, 676–685.

  • MCDERMOTT, B.E., DUALAN, I.V., and SCOTT, C.L. (2011), “The Predictive Ability of the Classification of Violence Risk (COVR) in a Forensic Psychiatric Hospital”, Psychiatric Services, 62, 430–433.

  • MEEHL, P.E., and ROSEN, A. (1955), “Antecedent Probability and the Efficiency of Psychometric Signs, Patterns, or Cutting Scores”, Psychological Bulletin, 52, 194–215.

  • MONAHAN, J., STEADMAN, H.J., APPELBAUM, P.S., GRISSO, T., MULVEY, E.P., ROTH, L.H., and SILVER, E. (2006), “The Classification of Violence Risk”, Behavioral Sciences and the Law, 24, 721–730.

  • MONAHAN, J., STEADMAN, H.J., ROBBINS, P.C., APPELBAUM, P.S., BANKS, S., GRISSO, T., and SILVER, E. (2005), “An Actuarial Model of Violence Risk Assessment for Persons with Mental Disorders”, Psychiatric Services, 56, 810–815.

  • MONAHAN, J., STEADMAN, H.J., ROBBINS, P.C., SILVER, E., APPELBAUM, P.S., GRISSO, T., and ROTH, L.H. (2000), “Developing a Clinically Useful Actuarial Tool for Assessing Violence Risk”, The British Journal of Psychiatry, 176, 312–319.

  • MONAHAN, J., STEADMAN, H.J., SILVER, E., APPELBAUM, P.S., ROBBINS, P.C., MULVEY, E.P., and BANKS, S. (2001), Rethinking Risk Assessment: The MacArthur Study of Mental Disorder and Violence, New York, NY: Oxford University Press.

  • MOSSMAN, D. (2006), “Critique of Pure Risk Assessment or, Kant Meets Tarasoff”, University of Cincinnati Law Review, 75, 523–609.

  • MOSSMAN, D. (2013), “Evaluating Risk Assessments Using Receiver Operating Characteristic Analysis: Rationale, Advantages, Insights, and Limitations”, Behavioral Sciences and the Law, 31, 23–39.

  • PASHLER, H., and WAGENMAKERS, E.J. (2012), “Editors’ Introduction to the Special Section on Replicability in Psychological Science: A Crisis of Confidence?”, Perspectives on Psychological Science, 7, 528–530.

  • POLLACK, I., and NORMAN, D.A. (1964), “A Non-Parametric Analysis of Recognition Experiments”, Psychonomic Science, 1, 125–126.

  • R CORE TEAM (2014), R: A Language and Environment for Statistical Computing (Version 3.1.1), Vienna, Austria, http://www.R-project.org/.

  • ROBERTS, S., and PASHLER, H. (2000), “How Persuasive is a Good Fit? A Comment on Theory Testing”, Psychological Review, 107, 358–367.

  • SNOWDEN, R.J., GRAY, N.S., TAYLOR, J., and FITZGERALD, S. (2009), “Assessing Risk of Future Violence Among Forensic Psychiatric Inpatients with the Classification of Violence Risk (COVR)”, Psychiatric Services, 60, 1522–1526.

  • SPSS, INC. (1993), SPSS for Windows (Release 6.0), Chicago, IL: SPSS, Inc.

  • STEADMAN, H.J., SILVER, E., MONAHAN, J., APPELBAUM, P.S., ROBBINS, P.C., MULVEY, E.P., and BANKS, S. (2000), “A Classification Tree Approach to the Development of Actuarial Violence Risk Assessment Tools”, Law and Human Behavior, 24, 83–100.

  • STURUP, J., KRISTIANSSON, M., and LINDQVIST, P. (2011), “Violent Behaviour by General Psychiatric Patients in Sweden: Validation of Classification of Violence Risk (COVR) Software”, Psychiatry Research, 188, 161–165.

  • VRIEZE, S.I., and GROVE, W.M. (2008), “Predicting Sex Offender Recidivism. I. Correcting for Item Overselection and Accuracy Overestimation in Scale Development. II. Sampling Error-Induced Attenuation of Predictive Validity over Base Rate Information”, Law and Human Behavior, 32, 266–278.

Author information

Correspondence to Ehsan Bokhari.

Additional information

Ehsan Bokhari is now a Senior Analyst with the Los Angeles Dodgers in Los Angeles, California.

Electronic supplementary material

ESM 1 (PDF 826 kb)

ESM 2 (PDF 852 kb)

Cite this article

Bokhari, E., Hubert, L. The Lack of Cross-Validation Can Lead to Inflated Results and Spurious Conclusions: A Re-Analysis of the MacArthur Violence Risk Assessment Study. J Classif 35, 147–171 (2018). https://doi.org/10.1007/s00357-018-9252-3

  • DOI: https://doi.org/10.1007/s00357-018-9252-3
