research-article

Teachable Conversational Agents for Crowdwork: Effects on Performance and Trust

Published: 11 November 2022

Abstract

Traditional crowdsourcing has mostly been viewed as a requester-worker interaction in which requesters publish tasks to solicit input from human crowdworkers. While most research in this area caters to the interests of requesters, we view this workflow as a teacher-learner interaction in which one or more human teachers solve Human Intelligence Tasks to train machine learners. In this work, we explore how teachable machine learners affect their human teachers, and whether the two form a trusting relationship that can be relied upon for task delegation in crowdsourcing. Specifically, we focus on teachable agents that learn to classify news articles while also guiding the teaching process through conversational interventions. In a two-part study in which several crowdworkers individually teach the agent, we investigate whether this learning-by-teaching approach benefits human-machine collaboration, and whether it leads to trustworthy AI agents that crowdworkers would delegate tasks to. Results demonstrate the benefits of the learning-by-teaching approach in terms of perceived usefulness for crowdworkers, and illustrate the dynamics of trust built through the teacher-learner interaction.

Supplementary Material

MP4 File (v6cscw2331.mp4)
Supplemental video

Information & Contributors

Information

Published In

Proceedings of the ACM on Human-Computer Interaction  Volume 6, Issue CSCW2
CSCW
November 2022
8205 pages
EISSN:2573-0142
DOI:10.1145/3571154
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. conversational interactions
  2. human-AI interaction
  3. interactive machine learning
  4. learning by teaching
  5. trusting conversational AI

Qualifiers

  • Research-article
