Abstract
This paper studies the effect of a user’s own task-related faults on her satisfaction with a fault-prone agent in a human-agent collaborative setting. Through a series of extensive experiments we find that user faults make the user more tolerant of agent faults, and consequently more satisfied with the collaboration, in particular compared to the case where the user performs faultlessly. This finding can be utilized to improve the design of collaborative agents. In particular, we present a proof-of-concept for such augmented design, in which the agent, whenever it is in charge of allocating the tasks or can pick its own, deliberately leaves the user with a relatively difficult task, increasing the chance of a user fault, which in turn increases user satisfaction.
Preliminary results of this work (in particular, the results of experiments T2 and T3) were presented as a poster at HAI 2021 [1].
Notes
1. Or, as an alternative to the score, present the number of captchas successfully solved so far.
2. A challenging captcha is one that combines characters that are difficult to distinguish. For example, in “bXF0yrl” it is not clear whether the middle character is the letter “o” or the digit 0; similarly, it is difficult to distinguish between a lower-case “L” and an upper-case “I”. While these are difficult for humans to distinguish, a computer agent will easily learn their patterns and distinguishing pixels.
3. The actual increase in satisfaction equals the decrease in dissatisfaction, and vice versa. However, the relative increase and decrease differ, as each calculation starts from a different baseline.
4. Out of the 272 subjects in T1 and T4, only 14 experienced more than two agent faults; their results were therefore added to the pool of the 73 subjects who experienced exactly two agent faults, forming the category “two or more agent faults”. Only eight subjects made two or more faults of their own, hence their category is excluded from the graph.
5. Meaning that subjects who already had a fault from earlier rounds (and hence did not receive a challenging captcha, as no intervention was needed) are excluded from this analysis. A minimal code sketch of this intervention rule follows these notes.
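The following is a minimal sketch of the intervention rule described in notes 2 and 5, using hypothetical names (AMBIGUOUS, pick_captcha, the captcha pools); it illustrates the idea only, not the actual implementation used in the experiments:

```python
# Hypothetical sketch: a challenging captcha mixes characters that humans
# easily confuse, and it is assigned only while the user is still faultless.
AMBIGUOUS = "oO0lI1"  # easily confused characters (cf. note 2)

def is_challenging(captcha: str) -> bool:
    """True if the captcha contains a character humans easily confuse."""
    return any(c in AMBIGUOUS for c in captcha)

def pick_captcha(user_fault_count: int, easy: list, hard: list) -> str:
    """Intervene with a challenging captcha only if the user has made no
    fault in earlier rounds (cf. note 5); otherwise assign an easy one."""
    return hard.pop() if user_fault_count == 0 and hard else easy.pop()
```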
References
Asraf, R., Rozenshtein, C., Sarne, D.: On the effect of user faults on her perception of agent’s faults in collaborative settings. In: Proceedings of the 9th International Conference on Human-Agent Interaction, HAI 2021, pp. 372–376. Association for Computing Machinery, New York (2021)
Azaria, A., Krishnamurthy, J., Mitchell, T.M.: Instructable intelligent personal agent. In: Thirtieth AAAI Conference on Artificial Intelligence (2016)
Azaria, A., Richardson, A., Kraus, S.: An agent for the prospect presentation problem. Technical report (2014)
Cassenti, D.N.: Recovery from automation error after robot neglect. In: Proceedings of the Human Factors and Ergonomics Society Annual Meeting, vol. 51, pp. 1096–1100. Sage Publications, Los Angeles (2007)
Correia, F., Guerra, C., Mascarenhas, S., Melo, F.S., Paiva, A.: Exploring the impact of fault justification in human-robot trust. In: Proceedings of the 17th International Conference on Autonomous Agents and Multiagent Systems, pp. 507–513 (2018)
Dabholkar, P.A., Spaid, B.I.: Service failure and recovery in using technology-based self-service: effects on user attributions and satisfaction. Serv. Ind. J. 32(9), 1415–1432 (2012)
Frost, R.O., Turcotte, T.A., Heimberg, R.G., Mattia, J.I., Holt, C.S., Hope, D.A.: Reactions to mistakes among subjects high and low in perfectionistic concern over mistakes. Cogn. Ther. Res. 19(2), 195–205 (1995). https://doi.org/10.1007/BF02229694
Giuliani, M., Mirnig, N., Stollnberger, G., Stadler, S., Buchner, R., Tscheligi, M.: Systematic analysis of video data from different human-robot interaction studies: a categorization of social signals during error situations. Front. Psychol. 6, 931 (2015)
Gompei, T., Umemuro, H.: A robot’s slip of the tongue: effect of speech error on the familiarity of a humanoid robot. In: 2015 24th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), pp. 331–336. IEEE (2015)
Hamacher, A., Bianchi-Berthouze, N., Pipe, A.G., Eder, K.: Believing in BERT: using expressive communication to enhance trust and counteract operational error in physical human-robot interaction. In: 2016 25th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), pp. 493–500. IEEE (2016)
Honig, S., Oron-Gilad, T.: Understanding and resolving failures in human-robot interaction: literature review and model development. Front. Psychol. 9, 861 (2018)
Law, N., Yuen, A., Shum, M., Lee, Y.: Final report on phase (ii) study on evaluating the effectiveness of the “empowering learning and teaching with information technology” strategy (2004/2007). Education Bureau HKSAR, Hong Kong (2007)
Lee, M.K., Kiesler, S., Forlizzi, J., Srinivasa, S., Rybski, P.: Gracefully mitigating breakdowns in robotic services. In: 2010 5th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 203–210. IEEE (2010)
LeeTiernan, S., Cutrell, E., Czerwinski, M., Hoffman, H.G.: Effective notification systems depend on user trust. In: INTERACT, pp. 684–685 (2001)
Mangos, P.M., Steele-Johnson, D.: The role of subjective task complexity in goal orientation, self-efficacy, and performance relations. Hum. Perform. 14(2), 169–185 (2001)
Mendoza, J.P., Veloso, M., Simmons, R.: Plan execution monitoring through detection of unmet expectations about action outcomes. In: 2015 IEEE International Conference on Robotics and Automation (ICRA), pp. 3247–3252. IEEE (2015)
Mirnig, N., Stollnberger, G., Miksch, M., Stadler, S., Giuliani, M., Tscheligi, M.: To err is robot: how humans assess and act toward an erroneous social robot. Front. Robot. AI 4, 21 (2017)
Peltason, J., Wrede, B.: The curious robot as a case-study for comparing dialog systems. AI Mag. 32(4), 85–99 (2011)
Ragni, M., Rudenko, A., Kuhnert, B., Arras, K.O.: Errare humanum est: erroneous robots in human-robot interaction. In: 2016 25th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), pp. 501–506. IEEE (2016)
Robinette, P., Howard, A.M., Wagner, A.R.: Timing is key for robot trust repair. In: ICSR 2015. LNCS (LNAI), vol. 9388, pp. 574–583. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25554-5_57
Rosenfeld, A., Zuckerman, I., Segal-Halevi, E., Drein, O., Kraus, S.: NegoChat-A: a chat-based negotiation agent with bounded rationality. Auton. Agent. Multi-Agent Syst. 30(1), 60–81 (2016). https://doi.org/10.1007/s10458-015-9281-9
Ross, R., Collier, R., O’Hare, G.M.: Demonstrating social error recovery with AgentFactory. In: Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, vol. 3, pp. 1424–1425 (2004)
Rossi, A., Dautenhahn, K., Koay, K.L., Walters, M.L.: Human perceptions of the severity of domestic robot errors. In: Kheddar, A., et al. (eds.) ICSR 2017. LNCS, vol. 10652, pp. 647–656. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70022-9_64
Salem, M., Lakatos, G., Amirabdollahian, F., Dautenhahn, K.: Would you trust a (faulty) robot? Effects of error, task type and personality on human-robot cooperation and trust. In: 2015 10th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 1–8. IEEE (2015)
Sarne, D., Rozenshtein, C.: Incorporating failure events in agents’ decision making to improve user satisfaction. In: Bessiere, C. (ed.) Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI-2020, pp. 1549–1555. International Joint Conferences on Artificial Intelligence Organization, July 2020. Main track
Schütte, N., Mac Namee, B., Kelleher, J.: Robot perception errors and human resolution strategies in situated human-robot dialogue. Adv. Robot. 31(5), 243–257 (2017)
Shiomi, M., Nakagawa, K., Hagita, N.: Design of a gaze behavior at a small mistake moment for a robot. Interact. Stud. 14(3), 317–328 (2013)
Steinbauer, G.: A survey about faults of robots used in RoboCup. In: Chen, X., Stone, P., Sucar, L.E., van der Zant, T. (eds.) RoboCup 2012. LNCS (LNAI), vol. 7500, pp. 344–355. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39250-4_31
Takayama, L., Groom, V., Nass, C.: I’m sorry, Dave: I’m afraid I won’t do that: social aspects of human-agent conflict. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 2099–2108 (2009)
Wellman, M.P.: Putting the agent in agent-based modeling. Auton. Agent. Multi-Agent Syst. 30(6), 1175–1189 (2016). https://doi.org/10.1007/s10458-016-9336-6
van der Woerdt, S., Haselager, P.: Lack of effort or lack of ability? Robot failures and human perception of agency and responsibility. In: Bosse, T., Bredeweg, B. (eds.) BNAIC 2016. CCIS, vol. 765, pp. 155–168. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-67468-1_11
Zhou, S., Bickmore, T., Paasche-Orlow, M., Jack, B.: Agent-user concordance and satisfaction with a virtual hospital discharge nurse. In: Bickmore, T., Marsella, S., Sidner, C. (eds.) IVA 2014. LNCS (LNAI), vol. 8637, pp. 528–541. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-09767-1_63
Appendices
A Experimental Framework Interface
Below is a screenshot of the experimental framework interface as it appeared to the participants (Fig. 5).
B Experimental Treatments Comparison
C State Machines
Figure 6 shows the state machines associated with each treatment flow. For simplicity, arrows pointing to the end state are not drawn from every relevant state; nevertheless, the game ends whenever the pre-specified score is reached.
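To make the treatment flows concrete, here is a minimal sketch of such a round-based flow, assuming hypothetical state names, success probabilities, and a target score of 15; the actual flows are those depicted in Fig. 6:

```python
from enum import Enum, auto
import random

class State(Enum):
    USER_TURN = auto()   # the participant solves a captcha
    AGENT_TURN = auto()  # the virtual player solves a captcha
    END = auto()

TARGET_SCORE = 15  # hypothetical pre-specified score

def run_flow(p_user_ok=0.9, p_agent_ok=0.9):
    """Simulate one simplified treatment flow: turns alternate, each
    correctly solved captcha adds a point, and the game ends as soon as
    the pre-specified score is reached."""
    state, score = State.USER_TURN, 0
    while state is not State.END:
        p_ok = p_user_ok if state is State.USER_TURN else p_agent_ok
        score += int(random.random() < p_ok)  # 1 point per correct solution
        if score >= TARGET_SCORE:
            state = State.END  # every state can transition to END
        else:
            state = State.AGENT_TURN if state is State.USER_TURN else State.USER_TURN
    return score
```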
D Measures
- Competence. (Question: “To what extent did you find the virtual player to be a competent partner?”) - this measure aims to capture the agent’s degree of ability from the participant’s point of view.
- Satisfaction. (Question: “To what extent are you satisfied with the virtual player?”) - this measure reflects the overall user experience and the degree of happiness with the agent.
- Recommendation. (Question: “To what extent would you recommend the virtual player to a friend, as a partner to work with?”) - this measure reflects the user’s loyalty. A purely illustrative sketch of recording these three measures follows the list.
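For illustration only, the three measures could be recorded as below; the 1-7 Likert bounds are an assumption made for this sketch, as the exact response scale is not reproduced in this appendix:

```python
from dataclasses import dataclass

@dataclass
class PostGameRatings:
    """The three post-game questionnaire measures; the 1-7 bounds below
    are an assumption made for this sketch only."""
    competence: int      # "...a competent partner?"
    satisfaction: int    # "...satisfied with the virtual player?"
    recommendation: int  # "...recommend the virtual player to a friend?"

    def __post_init__(self):
        for name, value in vars(self).items():
            if not 1 <= value <= 7:
                raise ValueError(f"{name} must be on the assumed 1-7 scale")
```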
E Complementary Graphs
While the graphs provided in the paper apply to the main measure of interest, user satisfaction, the equivalent graphs for the other two measures (the user’s estimate of the agent’s competence and her willingness to recommend the agent to other users) reveal qualitatively similar phenomena. We therefore provide these figures here.
E.1 Competence
Average competence rating as a function of the number of user and agent faults (equivalent to Fig. 1a in the paper).
Average competence rating as a function of the number of user faults in treatment T2 (equivalent to Fig. 2 in the paper).
Average competence rating as a function of the number of user faults in treatment T3 (equivalent to Fig. 3 in the paper).
Comparison of competence ratings in T1 and T4, providing a proof of concept for the success of designs that incorporate the effect of the user’s own faults on her perception of the collaborative agent’s competence (equivalent to Fig. 4 in the paper).
E.2 Recommendation
Average recommendation rating as a function of the number of user and agent faults (equivalent to Fig. 1a in the paper).
Average recommendation rating as a function of the number of user faults in treatment T2 (equivalent to Fig. 2 in the paper).
Average recommendation rating as a function of the number of user faults in treatment T3 (equivalent to Fig. 3 in the paper).
Comparison of recommendation ratings in T1 and T4, providing a proof of concept for the success of designs that incorporate the effect of the user’s own faults on her willingness to recommend the collaborative agent (equivalent to Fig. 4 in the paper).
F Participants’ Qualitative (Textual) Responses
- Agent competence per se - several subjects justified their rating solely on the basis of the agent’s performance, i.e., on how they perceived the agent’s competence. Some expressed this explicitly (“The virtual player made several mistakes”, “He only got one wrong and was quick”), while others did so implicitly (“This virtual player didn’t do very well”, “My virtual partner was quick and accurate. A great team player!”).
- Agent competence vs. participant competence - many subjects explained their rating by comparing the agent’s competence (mostly based on its number of faults) to their own. This pattern was observed whether the agent’s fault count was greater than, equal to, or smaller than the participant’s. For example, “The player did some extra mistake compared to me.” (two agent faults vs. one participant fault), “It was as competent as I was, only making one mistake.” (one agent fault vs. one participant fault), “Both of us solved with just one error.” (one agent fault vs. one participant fault), “played almost as good as me” (one agent fault vs. zero participant faults).
- Expectations from an automated agent - participants also tied their ratings to their initial assumptions about, and expectations from, the agent. In particular, most participants who emphasized this aspect mentioned that they expected the agent to be flawless. For example, “The virtual player only made one mistake, but I don’t think a virtual player should have made any mistakes at all”, “Baffled how a machine could err on such a simple task”, “If it truly was virtual, as in AI, it shouldn’t have missed any.”, “I expect a computer to be competent”. Several subjects compared the automated agent to a real person (“Virtual players is like real player”, “It kind of felt like the virtual player was a real player even though I knew it wasn’t.”), once again expressing their disappointment with agent faults.
- The complexity of the agent’s tasks - some participants tied their rating to the relative complexity of the agent’s tasks compared to their own. In particular, the fact that players (in T1 and T4) typically received the easier captcha seemed to have some influence on ratings. For example, “He did as well I did, and it looked like he had harder puzzles”, “He only made a single mistake and had what looked like more complex captchas”, “It only made one mistake and always seemed to have longer ones to solve than I did.”.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Asraf, R., Rozenshtein, C., Sarne, D. (2022). The Positive Effect of User Faults over Agent Perception in Collaborative Settings and Its Use in Agent Design. In: Chen, J., Lang, J., Amato, C., Zhao, D. (eds.) Distributed Artificial Intelligence. DAI 2021. Lecture Notes in Computer Science, vol. 13170. Springer, Cham. https://doi.org/10.1007/978-3-030-94662-3_9
DOI: https://doi.org/10.1007/978-3-030-94662-3_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-94661-6
Online ISBN: 978-3-030-94662-3