ABSTRACT
Interactive AI systems such as voice assistants are bound to make errors because of imperfect sensing and reasoning. Prior human-AI interaction research has illustrated the importance of error-mitigation strategies, such as explanations, monetary rewards, and apologies, in repairing perceptions of an AI after a breakdown in service. This paper extends that work by exploring how the way an apology is conveyed affects people’s perceptions of AI agents. We report an online study (N=37) examining how varying the sincerity of an apology and the assignment of blame (to either the agent itself or to others) affects participants’ perceptions of and experience with erroneous AI agents. We found that agents that openly accepted blame and apologized sincerely for their mistakes were perceived as more intelligent and likeable, and as more effective at recovering from errors, than agents that shifted the blame to others.
Owning Mistakes Sincerely: Strategies for Mitigating AI Errors