People Perceive Algorithmic Assessments as Less Fair and Trustworthy Than Identical Human Assessments

Published: 04 October 2023

Abstract

Algorithmic risk assessments are being deployed in an increasingly broad spectrum of domains including banking, medicine, and law enforcement. However, there is widespread concern about their fairness and trustworthiness, and people are also known to display algorithm aversion, preferring human assessments even when they are quantitatively worse. Thus, how does the framing of who made an assessment affect how people perceive its fairness? We investigate whether individual algorithmic assessments are perceived to be more or less accurate, fair, and interpretable than identical human assessments, and explore how these perceptions change when assessments are obviously biased against a subgroup. To this end, we conducted an online experiment that manipulated how biased risk assessments are in a loan repayment task, and reported the assessments as being made either by a statistical model or a human analyst. We find that predictions made by the model are consistently perceived as less fair and less interpretable than those made by the analyst despite being identical. Furthermore, biased predictive errors were more likely to widen this perception gap, with the algorithm being judged even more harshly for making a biased mistake. Our results illustrate that who makes risk assessments can influence perceptions of how acceptable those assessments are, even if they are identically accurate and identically biased against subgroups. Additional work is needed to determine whether and how decision aids should be presented to stakeholders so that the inherent fairness and interpretability of their recommendations, rather than their framing, determines how they are perceived.


Published in: Proceedings of the ACM on Human-Computer Interaction, Volume 7, Issue CSCW2 (CSCW), October 2023. 4055 pages. EISSN: 2573-0142. DOI: 10.1145/3626953.

Copyright © 2023 ACM. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher: Association for Computing Machinery, New York, NY, United States.
