Research Article
DOI: 10.1145/3322640.3326705

Why Machine Learning May Lead to Unfairness: Evidence from Risk Assessment for Juvenile Justice in Catalonia

Published: 17 June 2019

Abstract

In this paper we study the limitations of Machine Learning (ML) algorithms for predicting juvenile recidivism. In particular, we are interested in analyzing the trade-off between predictive performance and fairness. To that end, we evaluate the fairness of ML models in conjunction with SAVRY, a structured professional risk assessment framework, on a novel dataset originating in Catalonia. In terms of accuracy in predicting recidivism, the ML models slightly outperform SAVRY; the results improve with more data or more features available for training (AUC-ROC of 0.64 with SAVRY vs. AUC-ROC of 0.71 with ML models). However, across three fairness metrics used in other studies, we find that SAVRY is in general fair, while the ML models tend to discriminate against male defendants, foreigners, or people of specific national groups. For instance, foreigners who did not recidivate are almost twice as likely as Spanish nationals to be wrongly classified as high risk by ML models. Finally, we discuss potential sources of this unfairness and provide explanations for them by combining ML interpretability techniques with a thorough data analysis. Our findings provide an explanation for why ML techniques may lead to unfairness in data-driven risk assessment, even when protected attributes are not used in training.
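The disparity quoted above corresponds to comparing group-conditional false positive rates: among defendants who did not recidivate, what fraction each group is wrongly flagged as high risk. A minimal sketch of that comparison, on invented toy labels, predictions, and group codes (none of this is the paper's data or code):

```python
# Illustrative only: error-rate-balance style fairness check.
# Labels: 1 = recidivated / predicted high risk, 0 otherwise.
# Group codes "ES" / "FR" are hypothetical stand-ins for nationality groups.

def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN): share of actual negatives flagged as high risk."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn)

y_true = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0, 0, 1]
group  = ["ES", "ES", "ES", "ES", "ES", "FR", "FR", "FR", "FR", "FR"]

def group_fpr(g):
    """FPR restricted to the rows belonging to group g."""
    yt = [t for t, gr in zip(y_true, group) if gr == g]
    yp = [p for p, gr in zip(y_pred, group) if gr == g]
    return false_positive_rate(yt, yp)

# Ratio > 1 means group "FR" is flagged as high risk more often than "ES"
# among non-recidivists -- the kind of disparity the abstract describes.
disparity = group_fpr("FR") / group_fpr("ES")
print(f"FPR ratio (FR vs ES): {disparity:.2f}")
```

On this toy data the ratio is exactly 2, mirroring the "almost twice as likely" finding; on real data one would also report confidence intervals, since group sizes can be small.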




      Published In

      ICAIL '19: Proceedings of the Seventeenth International Conference on Artificial Intelligence and Law
      June 2019
      312 pages
      ISBN:9781450367547
      DOI:10.1145/3322640

      In-Cooperation

      • Univ. of Montreal: University of Montreal
      • AAAI
      • IAAIL: International Association for Artificial Intelligence and Law

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. algorithmic bias
      2. algorithmic fairness
      3. criminal recidivism
      4. machine learning
      5. risk assessment

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      ICAIL '19

      Acceptance Rates

      Overall Acceptance Rate 69 of 169 submissions, 41%


      Article Metrics

      • Downloads (Last 12 months)120
      • Downloads (Last 6 weeks)19
      Reflects downloads up to 08 Mar 2025

      Cited By

      • (2024) Algorithms and Recidivism: A Multi-Disciplinary Systematic Review. Proceedings of the 2024 AAAI/ACM Conference on AI, Ethics, and Society. DOI: 10.5555/3716662.3716775, pp. 1292-1305. Online publication date: 21-Oct-2024
      • (2024) On the maximal local disparity of fairness-aware classifiers. Proceedings of the 41st International Conference on Machine Learning. DOI: 10.5555/3692070.3692959, pp. 22115-22144. Online publication date: 21-Jul-2024
      • (2024) What hides behind unfairness? Exploring dynamics fairness in reinforcement learning. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence. DOI: 10.24963/ijcai.2024/432, pp. 3908-3916. Online publication date: 3-Aug-2024
      • (2024) Fair Column Subset Selection. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. DOI: 10.1145/3637528.3672005, pp. 2189-2199. Online publication date: 25-Aug-2024
      • (2024) A Critical Survey on Fairness Benefits of Explainable AI. Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency. DOI: 10.1145/3630106.3658990, pp. 1579-1595. Online publication date: 3-Jun-2024
      • (2024) Fairness in Machine Learning: A Survey. ACM Computing Surveys 56(7). DOI: 10.1145/3616865, pp. 1-38. Online publication date: 9-Apr-2024
      • (2024) "I know even if you don't tell me": Understanding Users' Privacy Preferences Regarding AI-based Inferences of Sensitive Information for Personalization. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems. DOI: 10.1145/3613904.3642180, pp. 1-21. Online publication date: 11-May-2024
      • (2024) Graph Fairness Learning under Distribution Shifts. Proceedings of the ACM Web Conference 2024. DOI: 10.1145/3589334.3645508, pp. 676-684. Online publication date: 13-May-2024
      • (2024) Fairness and Bias in Robot Learning. Proceedings of the IEEE 112(4). DOI: 10.1109/JPROC.2024.3403898, pp. 305-330. Online publication date: Apr-2024
      • (2024) Fairness issues, current approaches, and challenges in machine learning models. International Journal of Machine Learning and Cybernetics 15(8). DOI: 10.1007/s13042-023-02083-2, pp. 3095-3125. Online publication date: 31-Jan-2024
